ineptr2 is a ground-up rewrite of ineptR. This article summarises what changed and why.
INE API V2
ineptr2 targets version 2 of the INE API, which introduced richer dimension filtering while remaining backward-compatible with V1 queries. The key additions:
- Dim1=T (all periods): setting dimension 1 to T retrieves all available reference periods, instead of having to perform one call for each period explicitly.
- Comma-separated values: multiple values in a single dimension parameter (e.g. Dim1=S7A2019,S7A2020), enabling multi-value queries without multiple API calls.
- Level-based selection (lvl@N): retrieve all entries at a given hierarchical level (e.g. lvl@5 on the geographic dimension returns all municipalities).
- Classification subtree (<*>CODE): retrieve all children under a classification node (e.g. <*>200 on the geographic dimension returns the NUTS III Madeira region and all its municipalities).
- Up to 1 million rows: the limit on returned rows per call was raised from 40 000 to 1 000 000.
These features let ineptr2 build smarter, more compact URL grids —
which directly reduces the number of HTTP requests needed for filtered
queries. For example, the full 0008206 indicator now
requires 66 calls, compared to over 2000 calls with V1.
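To make the effect of comma-separated values concrete, here is a minimal base R sketch that compares one query per period (V1 style) with several periods per query (V2 style). It is not ineptr2's grid-building code: the helper chunk_values(), the group size of 4 and the period codes generated from the S7A pattern are illustrative assumptions, and the strings are not URL-encoded.

```r
# Hypothetical helper: split a vector into consecutive groups of at most `size`.
chunk_values <- function(x, size) {
  split(x, ceiling(seq_along(x) / size))
}

# Illustrative period codes, following the S7A2019 pattern shown above.
periods <- paste0("S7A", 2011:2020)

# V1 style: one query string (and one HTTP call) per period.
v1_queries <- paste0("Dim1=", periods)

# V2 style: several periods share one comma-separated Dim1 parameter.
v2_queries <- vapply(
  chunk_values(periods, size = 4),
  function(p) paste0("Dim1=", paste(p, collapse = ",")),
  character(1),
  USE.NAMES = FALSE
)

length(v1_queries)  # 10 calls
length(v2_queries)  # 3 calls
v2_queries[1]       # "Dim1=S7A2011,S7A2012,S7A2013,S7A2014"
```

In practice the group size would follow from the row limit and the sizes of the other dimensions rather than being a fixed number.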
R6 client instead of standalone functions
ineptR exported five standalone functions
(get_ine_data(), get_metadata(),
get_dim_info(), get_dim_values(),
is_indicator_valid()). Every call was stateless, so you had
to pass lang each time, and caching was not implemented:
# ineptR — repeat lang on every call
df <- get_ine_data("0008273", lang = "EN")
meta <- get_metadata("0008273", lang = "EN")
dims <- get_dim_values("0008273", lang = "EN")

ineptr2 wraps everything in a single R6 object. Configure once, call methods:
# ineptr2 — configure once, use everywhere
ine <- INEine$new(lang = "EN")
df <- ine$get_data("0008273")
meta <- ine$get_metadata("0008273")
dims <- ine$get_dim_values("0008273")

The method names are now more concise (get_data instead of get_ine_data) and more consistent with each other (get_data and get_metadata instead of get_ine_data and get_metadata).
Settings are live — change ine$lang mid-session and
every subsequent call picks it up, with validation built in
(e.g. lang only accepts "PT" or
"EN", row_limit is capped at the API
maximum).
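The idea of a live, validated setting can be sketched in base R with an active binding, which is similar in spirit to an R6 active field. This is an illustration only, not ineptr2's implementation: new_client() and validate_lang() are hypothetical names, and base R is used here so the sketch runs without extra packages.

```r
# Hypothetical validator: only "PT" and "EN" are accepted.
validate_lang <- function(value) {
  ok <- is.character(value) && length(value) == 1L && value %in% c("PT", "EN")
  if (!ok) stop("lang must be \"PT\" or \"EN\"", call. = FALSE)
  value
}

# Hypothetical constructor: returns an environment whose `lang` field
# is validated on every assignment.
new_client <- function(lang) {
  state <- new.env()
  state$lang <- validate_lang(lang)
  client <- new.env()
  makeActiveBinding("lang", function(value) {
    if (missing(value)) return(state$lang)  # read access
    state$lang <- validate_lang(value)      # write access, validated
  }, client)
  client
}

client <- new_client(lang = "EN")
client$lang <- "PT"  # accepted, later reads return "PT"

# An unsupported value is rejected and the previous setting is kept.
rejected <- tryCatch({ client$lang <- "FR"; FALSE }, error = function(e) TRUE)
```

Because validation runs inside the setter, an invalid assignment fails immediately instead of surfacing later as a malformed API request.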
Smaller footprint
ineptR pulled in 11 packages at install time:
dplyr, httr, httr2, lifecycle, magrittr, progressr, purrr, readr, stringr, tibble, tidyr
ineptr2 needs only 4 packages:
httr2, jsonlite, R6, xml2
All data wrangling now uses base R and the entire tidyverse chain is gone. The package installs faster and has a smaller footprint.
Smarter chunking
The INE API limits each response to a maximum number of rows. ineptR
implemented a maximum of 40 000 rows per request
(max_cells), which meant that large indicators required
hundreds or thousands of small API calls.
ineptr2 defaults to a much higher limit
(row_limit = 1 000 000) and splitting logic is improved
(Dimension 1 now accepts multiple values per call). The result is far
fewer HTTP requests for the same data. You can preview the chunk plan
before committing to a download:
ine$preview_chunks("0008206")

File-based caching
ineptR had no caching: every call hit the API fresh, and all data lived in memory. When downloading large indicators, the user had to manage the cycle manually: fetch some data, save it to disk, fetch more.
ineptr2 introduces a chunk-based file cache with three layers:
- Chunk cache: raw API responses stored as individual JSON files on disk
- Data cache: processed data stored as RDS, with dimension filters tracked so the cache is only reused when the new request is a subset of what was cached
- Metadata cache: used by is_updated() to detect changes without a full download
Caching is enabled via ine$use_cache <- TRUE or at
construction time, and the cache directory is configurable. See the How
caching works article for a detailed walkthrough.
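The subset rule for the data cache can be illustrated with a small base R check. This is a sketch under assumptions, not the package's actual logic: covers_request() is a hypothetical helper, filters are represented as named lists of dimension values, and a dimension missing from a list is taken to mean "no filter, all values".

```r
# Hypothetical check: TRUE when everything `requested` asks for is
# already contained in the data described by the `cached` filters.
covers_request <- function(cached, requested) {
  for (dim in names(cached)) {
    wanted <- requested[[dim]]
    # The request is unfiltered on a dimension the cache only partly holds.
    if (is.null(wanted)) return(FALSE)
    # The request asks for at least one value the cache does not hold.
    if (!all(wanted %in% cached[[dim]])) return(FALSE)
  }
  TRUE
}

cached <- list(Dim1 = c("S7A2019", "S7A2020"))

covers_request(cached, list(Dim1 = "S7A2019"))   # TRUE: subset of the cache
covers_request(cached, list(Dim1 = "S7A2021"))   # FALSE: period not cached
covers_request(cached, list())                   # FALSE: wants all periods
covers_request(list(), list(Dim1 = "S7A2019"))   # TRUE: cache is unfiltered
```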
Download and load as separate steps
ineptR always downloaded and processed data in one go via
get_ine_data(). For large indicators this meant holding
everything in memory at once and risking out-of-memory errors.
ineptr2 separates the two concerns:
| Step | Method | What it does |
|---|---|---|
| Download | ine$download_data() | Streams chunks to disk, nothing held in memory |
| Load | ine$load_raw_data() | Reads cached raw chunks back into R |
| Both | ine$get_data() | Convenience wrapper that downloads, loads and cleans data |
This means you can download overnight, close R, and load the results later.
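The separation can be pictured with a minimal base R sketch in which each chunk is written to disk as soon as it arrives and combined only in a later step. It does not use ineptr2 itself: fetch_chunk() is a hypothetical stand-in for one API call, and chunks are stored as RDS here for simplicity, whereas the package stores raw responses as JSON.

```r
# Hypothetical stand-in for one API call: returns a small data frame.
fetch_chunk <- function(i) data.frame(chunk = i, value = i * 10)

cache_dir <- file.path(tempdir(), "ine_chunks_demo")
dir.create(cache_dir, showWarnings = FALSE)

# Download step: persist each chunk immediately, so only one chunk
# is held in memory at any time.
for (i in 1:3) {
  saveRDS(fetch_chunk(i), file.path(cache_dir, sprintf("chunk_%03d.rds", i)))
}

# Load step (possibly in a later session): read the chunks back and combine.
files <- sort(list.files(cache_dir, pattern = "^chunk_[0-9]+\\.rds$",
                         full.names = TRUE))
data <- do.call(rbind, lapply(files, readRDS))
nrow(data)  # 3
```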
Resume support
If a download is interrupted (network drop, session crash, timeout),
ineptr2 picks up where it left off. Each download creates a JSON
manifest tracking which chunks succeeded. Calling
download_data() again skips completed chunks
automatically.
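A manifest-based resume can be sketched in a few lines of base R. This illustrates the general technique rather than ineptr2's code: download_with_resume() and flaky_fetch() are hypothetical, and the manifest is a plain text file with one chunk id per line instead of the JSON manifest the package uses.

```r
# Hypothetical resumable download: skips chunk ids already in the manifest.
download_with_resume <- function(chunk_ids, fetch, dir) {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  manifest <- file.path(dir, "manifest.txt")
  done <- if (file.exists(manifest)) readLines(manifest) else character(0)
  for (id in setdiff(chunk_ids, done)) {
    saveRDS(fetch(id), file.path(dir, paste0(id, ".rds")))
    # Record success only after the chunk is safely on disk.
    cat(id, "\n", sep = "", file = manifest, append = TRUE)
    done <- c(done, id)
  }
  invisible(done)
}

# Simulated fetcher that fails the first time chunk "c3" is requested.
counter <- new.env()
counter$n <- 0
flaky_fetch <- function(id) {
  counter$n <- counter$n + 1
  if (id == "c3" && counter$n <= 3) stop("simulated network drop")
  data.frame(id = id)
}

dir <- file.path(tempdir(), "ine_resume_demo")
ids <- c("c1", "c2", "c3", "c4")

first  <- try(download_with_resume(ids, flaky_fetch, dir), silent = TRUE)
second <- download_with_resume(ids, flaky_fetch, dir)  # fetches only c3 and c4
```

The first run stops at the simulated failure with two chunks recorded. The second run reads the manifest and requests only the remaining two, so the fetcher is called five times in total rather than seven.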
Full indicator catalog
ineptr2 exposes methods to download
(ine$download_catalog()) and load
(ine$get_catalog()) the full indicator catalog with over 13
000 indicators.
New methods
ineptr2 adds several methods that had no equivalent in ineptR:
- ine$info(): prints a human-readable summary of an indicator: name, periodicity, time range, dimensions and their sizes.
- ine$is_updated(): compares cached metadata against the live API to check if data has been updated since your last download.
- ine$preview_chunks(): shows how many API calls a download will require, before you commit to it.
- ine$get_catalog() / ine$download_catalog(): retrieve the full INE indicator catalog.
- ine$list_cached(): lists all indicators currently in the file cache.
- ine$clear_cache(): clears cached files for one or all indicators.
Summary
| | ineptR | ineptr2 |
|---|---|---|
| Interface | 5 standalone functions | R6 client object |
| Dependencies | 11 | 4 |
| Rows per API request | 40 000 (max) | 1 000 000 (max) |
| File-based caching | No | Chunk, data, and metadata layers |
| Download without loading to memory | No | download_data() + load_raw_data() / get_data() |
| Resume interrupted downloads | No | Automatic via manifest |
| Preview chunk plan | No | preview_chunks() |
| Indicator catalog | No | get_catalog() / download_catalog() |
