Pagination in REST APIs

Pagination-in-REST-APIs-Feature-Image.jpg

When you are working with a large dataset, understanding pagination in REST APIs is important to avoid overloading the system with a huge load of data in a single request. Power Query in Power BI can be an excellent tool to query and process such paginated API responses, making REST API pagination in Power BI seamless and efficient. Pagination is essential when handling large datasets in Power BI to avoid performance issues and memory overload with the help of parameters like page, offset, or limit. In this blog, let us explore how paginated API responses with Power Query in Power BI are handled.

Table of Contents:

What is Pagination in REST APIs?

A paginated response is a technique used by APIs to split large datasets into smaller ones. The API does not return all data in a single response; it returns a limited set of results along with metadata of the next set of results so that you can fetch more data if needed.

The pagination metadata includes:

  • Current page number
  • Total pages
  • Next page URL

Advantages of Pagination

  • Performance Optimization: Large datasets in a table are maintained by pagination, as it prevents slow performance that occurs while handling large data in one request.
  • Managing Server Load: By dividing large data into smaller chunks, the server can manage requests more easily, as it reduces the risk of the system being overloaded.
  • Improved User Experience: Data pagination allows faster loading times and smoother user interaction, which improves user experience.
  • Memory Efficiency: Pagination helps to reduce memory usage by loading smaller sets of data at a time.
  • Reduced Network Traffic: Paginated responses minimize the amount of data transferred in a single request, which reduces network usage and improves response time

Querying RESTful APIs with Power Query

Power Query API integration allows you to fetch, transform, and clean the data from paginated APIs directly into Power BI. The most common use of Power Query is querying RESTful APIs. When you have a large dataset, these APIs return data in a paginated format. Power BI can handle such a paginated response very well by allowing you to process large amounts of data.

Custom Power Query Paging Function

Using Power Query, you can handle a paginated REST API Power Query response efficiently by creating a custom paging function.

To understand this function step-by-step, we will consider an API example from the U.S. Treasury Reporting Rates of Exchange.

Creating The GetPage Function

We will create a custom function named GetPage that takes a page number as an argument and fetches the data for that page.

1. Open Power Query Editor: Click on the “Home” tab and then click “Transform Data” to open Power Query Editor.

Open Power Query Editor Output

2. Create a Blank Query: Go to the “Home” tab and then click “New Source”> “Blank Query”.

Create a Blank Query Output

3. Open Advanced Editor: Click on the “Advanced Editor” in the home tab to write custom code.

Open Advanced Editor Output

4. Write the code: Paste the following code on the tab that will appear after clicking on Advanced Editor and name it GetPage.

(page as text) =>
  let
    Source = try
      Json.Document(
        Web.Contents(
          "https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange?page[number]="
            & page
            & "&fields=country_currency_desc,exchange_rate,record_date"
        )
      )
    otherwise
      [data = null, meta = null, links = null], 
    #"Converted to Table" = Table.FromRecords({Source}), 
    #"Removed Columns" = Table.RemoveColumns(#"Converted to Table", {"meta", "links"}), 
    #"Expanded data List" = Table.ExpandListColumn(#"Removed Columns", "data"), 
    #"Expanded data Records" = Table.ExpandRecordColumn(
      #"Expanded data List", 
      "data", 
      {"country_currency_desc", "exchange_rate", "record_date"}, 
      {"data.country_currency_desc", "data.exchange_rate", "data.record_date"}
    )
  in
    #"Expanded data Records"
Write the code Output

This GetPage function helps you fetch paginated data, Power BI style, page by page, ensuring no records are missed.

Here are some API pagination examples that will help you fetch all pages from a REST API using Power Query.

Methods to Retrieve Pages

Now, we have a function called GetPage, and we can retrieve all pages by iterating through the pages.

Method 1: Page Count is Known

To loop through API pages, Power Query provides several methods depending on the metadata available from the API.

Steps to retrieve all pages:

  • Open the Power Query Editor
  • Create a Blank Query
  • Paste the following code in the Advanced Editor

Code for Method 1:

let
  Source = try
    Json.Document(
      Web.Contents(
        "https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange?fields=country_currency_desc,exchange_rate,record_date"
      )
    )
  otherwise
    [data = null, meta = null, links = null], 
  #"Page List" = 
    if Source = null or Source[meta] = null then
      {null}
    else
      {1 .. Source[meta][#"total-pages"]}, 
  #"Converted to Table" = Table.FromList(
    #"Page List", 
    Splitter.SplitByNothing(), 
    null, 
    null, 
    ExtraValues.Error
  ), 
  #"Renamed Columns" = Table.RenameColumns(#"Converted to Table", {{"Column1", "PAGES"}}), 
  #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns", {{"PAGES", type text}}), 
  #"Invoked Custom Function" = Table.AddColumn(
    #"Changed Type", 
    "GetPage", 
    each if [PAGES] <> null then GetPage([PAGES]) else null
  ), 
  #"Expanded GetPage" = Table.ExpandTableColumn(
    #"Invoked Custom Function", 
    "GetPage", 
    {"data.country_currency_desc", "data.exchange_rate", "data.record_date"}, 
    {"GetPage.data.country_currency_desc", "GetPage.data.exchange_rate", "GetPage.data.record_date"}
  )
in
  #"Expanded GetPage"

After running the query, you will get a table containing all the data from both tables.

Output:

Code for Method 1 Output

Method 2: Page Count is Unknown; Entry Count is Known

If the total number of entries and entries per page are given but the total number of pages is not provided, then we can calculate the number of pages.

let
  Source = try
    Json.Document(
      Web.Contents(
        "https://api.fiscaldata.treasury.gov/services/api/fiscal_service/v1/accounting/od/rates_of_exchange?fields=country_currency_desc,exchange_rate,record_date"
      )
    )
  otherwise
    [data = null, meta = null, links = null], 
  #"Page List" = 
    if Source = null or Source[meta] = null then
      {null}
    else
      {1 .. Number.RoundUp(Source[meta][#"total-count"] / Source[meta][#"count"])}, 
  #"Converted to Table" = Table.FromList(
    #"Page List", 
    Splitter.SplitByNothing(), 
    null, 
    null, 
    ExtraValues.Error
  ), 
  #"Renamed Columns" = Table.RenameColumns(#"Converted to Table", {{"Column1", "PAGES"}}), 
  #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns", {{"PAGES", type text}}), 
  #"Invoked Custom Function" = Table.AddColumn(
    #"Changed Type", 
    "GetPage", 
    each if [PAGES] <> null then GetPage([PAGES]) else null
  ), 
  #"Expanded GetPage" = Table.ExpandTableColumn(
    #"Invoked Custom Function", 
    "GetPage", 
    {"data.country_currency_desc", "data.exchange_rate", "data.record_date"}, 
    {"GetPage.data.country_currency_desc", "GetPage.data.exchange_rate", "GetPage.data.record_date"}
  )
in
  #"Expanded GetPage"

Output:

Entry Count is Known Output

Method 3: Both Page Count and Entry Count are Unknown

In this case, neither the total number of pages nor entries is provided, so we will use List. Generate to fetch pages until no data is returned.

let
  Source = List.Generate(
    () => [PAGE = 1, RESULTS = GetPage("1")], 
    each Table.RowCount(
      Table.SelectRows(_[RESULTS], each not List.Contains(Record.ToList(_), null))
    )
      > 0, 
    each [PAGE = _[PAGE] + 1, RESULTS = GetPage(Number.ToText([PAGE] + 1))], 
    each _[[PAGE], [RESULTS]]
  ), 
  #"Converted to Table" = Table.FromList(
    Source, 
    Splitter.SplitByNothing(), 
    null, 
    null, 
    ExtraValues.Error
  ), 
  #"Expanded Column" = Table.ExpandRecordColumn(
    #"Converted to Table", 
    "Column1", 
    {"PAGE", "RESULTS"}, 
    {"PAGES", "GetPage"}
  ), 
  #"Changed Type" = Table.TransformColumnTypes(#"Expanded Column", {{"PAGES", type text}}), 
  #"Expanded GetPage" = Table.ExpandTableColumn(
    #"Changed Type", 
    "GetPage", 
    {"data.country_currency_desc", "data.exchange_rate", "data.record_date"}, 
    {"GetPage.data.country_currency_desc", "GetPage.data.exchange_rate", "GetPage.data.record_date"}
  )
in
  #"Expanded GetPage"

Output:

Both Page Count and Entry Count are Unknown Output

Performance Comparison

Method Performance Efficiency Best for
Method 1: Page Count is knownFastest  Performance.Directly uses total-pages metadata for looping.Best for APIs that provide the total pages.
Method 2: Entry Count is knownPerformance is slightly slower than Method 1.Calculates total pages using total-count and count.Best for APIs that provide the total count but not the total pages.
Method 3: Both Counts are UnknownPerformance is Slowest.Dynamically generates pages, checking for data presence.Best for APIs that do not provide a total count or total pages.

Best Practices

  • Check Available Metadata: Check the API’s documentation to check if metadata is present.
  • Handle API Limits and Errors: Make sure you implement error handling and retries in Power Query if your API has limits. You can use try… otherwise expressions to catch and handle errors.
  • Optimize Performance for Queries: Request larger pages to reduce the total number of requests in order to minimize the number of API calls. Performance test when scaling up larger datasets.
  • Caching response in API: When you are using an API that does not change frequently, consider caching data locally or storing it in a database to minimize requests.

Conclusion

Power Query offers several solutions while handling paginated data, each solution having its benefits. Method 1 (Page Count is Known) is the most efficient when the page count is available, while Method 2 (Page Count is unknown and Entry Count is known) works well when you only have the total entry count. Method 3 ( Both Counts are Unknown) is the slowest and should be used when no metadata is available.

To learn more about Power BI and its functions, check out this Power BI Course and also explore Power BI Interview Questions prepared by industry experts.

Handling Paginated REST API Data – FAQs

1. What is paginated data?

Paginated data means dividing the larger datasets into smaller ones in order to avoid the rush at the server or client side.

2. Why do APIs paginate responses?

APIs paginate responses to manage large datasets efficiently and reduce server load by returning data in smaller parts.

3. Can Power Query automatically retrieve all pages from an API?

Yes, Power Query automatically retrieves all pages from an API by using custom functions and looping methods.

4. What is the 'GetPage' function in Power Query?

The “GetPage” is a custom Power Query function used to fetch individual pages of data.

5. What happens if the page count is unknown?

If the page count is unknown, then you can calculate the number of pages based on the total entries.

About the Author

Technical Writer | Business Analyst

Yash Vardhan Gupta is an expert in data and business analysis, skilled at turning complex data into clear and actionable insights. He works with tools like Power BI, Tableau, SQL, and Markdown to develop effective documentation, including SRS and BRD. He helps teams interpret data, make informed decisions, and drive better business outcomes. He is also passionate about sharing his expertise in a simple and understandable way to help others learn and apply it effectively.

Data Analytics for Business