Abstract

RAQSAPI is a package for R that connects the R programming language environment to the United States Environmental Protection Agency’s (US EPA) Air Quality System (AQS) Data Mart database API for retrieval of ambient air pollution data.
This software/application was developed by the U.S. Environmental Protection Agency (USEPA). No warranty expressed or implied is made regarding the accuracy or utility of the system, nor shall the act of distribution constitute any such warranty. The USEPA has relinquished control of the information and no longer has responsibility to protect the integrity, confidentiality or availability of the information. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation or favoring by the USEPA. The USEPA seal and logo shall not be used in any manner to imply endorsement of any commercial product or activity by the USEPA or the United States Government.
| Warning: US EPA’s AQS Data Mart API V2 is currently
in the beta phase of development; the API interface has not been finalized.
This means that certain functionality of the API may change or be removed
without notice. As a result, this package is also currently marked as beta and
may also change to reflect any changes made to the Data Mart API or in respect
to improvements in the design, functionality, quality and documentation of
this package. The authors assume no liability for any problems that may occur
as a result of using this package, the Data Mart service, any software,
service, hardware, or user accounts that may utilize this package.
The RAQSAPI package allows an R programming environment to connect to and retrieve data directly from the United States Environmental Protection Agency’s (US EPA) Air Quality System (AQS) Data Mart API v2 1 interface. This package spares the data user several legacy challenges, including coercing data from a JSON object to a usable R object, retrieving multiple years of data, formatting API requests, retrieving results, handling credentials, requesting data for multiple pollutants and rate limiting data requests. All of the basic functionality available from the AQS Data Mart API server has been implemented. The library connects to the AQS Data Mart API via Hypertext Transfer Protocol (HTTP), so there is no need to install external ODBC drivers, configure ODBC connections or deal with the security vulnerabilities associated with them. Most functions have a parameter, return_header, which is set to FALSE by default. If the user sets return_header to TRUE, the function returns an AQS_DATAMART_APIv2 S3 object, which is a two item named list.
The first item ($Header) in the AQS_DATAMART_APIv2 object is a tibble 2 which contains the header information. The Header contains status information regarding the request (success/fail), any applicable error messages returned from the API, the URL used in the request, a date and time stamp noting when the request was received and other useful information. The second item ($Data) is a tibble which contains the actual data being requested. For functions with the return_header option set to FALSE (the default), a simple tibble is returned with just the $Data portion of the request. After each call to the API a five second stall is invoked to help prevent overloading the Data Mart API server and to serve as a simple rate limit. 3
Either install the stable version from CRAN or install the latest development version from GitHub.
install.packages(pkgs="RAQSAPI", dependencies = TRUE )
To install the development version of RAQSAPI, first install the remotes package and its dependencies if they are not already installed. Then run the following in an R environment.
remotes::install_github(repo = "USEPA/raqsapi",
                        dependencies = TRUE,
                        upgrade = "always",
                        build = TRUE,
                        #optional, if you want the manual, requires pandoc
                        build_manual = FALSE,
                        build_vignettes = TRUE
                        )
After successfully installing the RAQSAPI package, load the RAQSAPI library:
library(RAQSAPI)
If you have not already done so, you will need to sign up with AQS Data Mart using the aqs_sign_up function. 4 This function takes one input, “email”, which is an R character object that represents the email address that you want to use as a user credential for the AQS Data Mart service. After a successful call to aqs_sign_up, an email message will be sent to the address provided with a new Data Mart key, which will be used as a credential key to access the Data Mart API. The aqs_sign_up function can also be used to regenerate a key for an existing user: simply call aqs_sign_up with the parameter “email” set to the address of an existing account. A new key will be e-mailed to that address.
The credentials used to access the Data Mart API service are stored in an R environment variable that needs to be set every time the RAQSAPI library is attached or the key is changed. Without valid credentials, the Data Mart server will reject any request sent to it. The key used with Data Mart is a key and not a password, so the RAQSAPI library does not treat it as a password; this means that the key is stored in plain text and no attempt is made to encrypt Data Mart credentials as would be done for a username and password combination. The key that is supplied for use with Data Mart is not intended for authentication but only for account monitoring. Each time RAQSAPI is loaded, and before using any of its functions, use the aqs_credentials 5 function to enter the user credentials so that RAQSAPI can access the AQS Data Mart server.
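As a minimal sketch of this step, the user name and key can be kept out of scripts by reading them from operating system environment variables before they are passed to aqs_credentials. The helper name set_aqs_credentials_from_env and the variable names AQS_USERNAME and AQS_KEY are assumptions made for this illustration and are not part of RAQSAPI; only aqs_credentials and its username and key parameters come from the package.

```r
# Hypothetical helper (not part of RAQSAPI): reads the Data Mart user name and
# key from environment variables and passes them to RAQSAPI::aqs_credentials().
# AQS_USERNAME and AQS_KEY are illustrative variable names.
set_aqs_credentials_from_env <- function() {
  username <- Sys.getenv("AQS_USERNAME", unset = NA_character_)
  key <- Sys.getenv("AQS_KEY", unset = NA_character_)
  if (is.na(username) || is.na(key)) {
    stop("Set AQS_USERNAME and AQS_KEY before calling this helper.")
  }
  # Requires the RAQSAPI package to be installed.
  RAQSAPI::aqs_credentials(username = username, key = key)
}
```

Calling set_aqs_credentials_from_env() once after the library is loaded would then take the place of typing the key into each script.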
| Note: The credentials used to access AQS Data Mart
API are not the same as the credentials used to access AQS. AQS users who do
not have access to the AQS Data Mart will need to create new credentials.
This section contains suggestions for completing certain data related tasks.
The AQS Data Mart administrators may disable accounts without notice for failure to adhere to these terms (though they will contact the offending user via the email address provided).
The RAQSAPI library exports the following functions (in alphabetical order):
aqs_annualsummary_by_box aqs_annualsummary_by_cbsa aqs_annualsummary_by_county aqs_annualsummary_by_site aqs_annualsummary_by_state aqs_cbsas aqs_classes aqs_counties_by_state aqs_credentials aqs_dailysummary_by_box aqs_dailysummary_by_cbsa aqs_dailysummary_by_county aqs_dailysummary_by_site aqs_dailysummary_by_state aqs_fields_by_service aqs_isavailable aqs_knownissues aqs_mas aqs_monitors_by_box aqs_monitors_by_cbsa aqs_monitors_by_county aqs_monitors_by_site aqs_monitors_by_state aqs_parameters_by_class aqs_pqaos aqs_qa_blanks_by_county aqs_qa_blanks_by_MA aqs_qa_blanks_by_pqao aqs_qa_blanks_by_site aqs_qa_blanks_by_state aqs_qa_collocated_assessments_by_county aqs_qa_collocated_assessments_by_MA aqs_qa_collocated_assessments_by_pqao aqs_qa_collocated_assessments_by_site aqs_qa_collocated_assessments_by_state aqs_qa_flowrateaudit_by_county aqs_qa_flowrateaudit_by_MA aqs_qa_flowrateaudit_by_pqao aqs_qa_flowrateaudit_by_site aqs_qa_flowrateaudit_by_state aqs_qa_flowrateverification_by_county aqs_qa_flowrateverification_by_MA aqs_qa_flowrateverification_by_pqao aqs_qa_flowrateverification_by_site aqs_qa_flowrateverification_by_state aqs_qa_one_point_qc_by_county aqs_qa_one_point_qc_by_MA aqs_qa_one_point_qc_by_pqao aqs_qa_one_point_qc_by_site aqs_qa_one_point_qc_by_state aqs_qa_pep_audit_by_county aqs_qa_pep_audit_by_MA aqs_qa_pep_audit_by_pqao aqs_qa_pep_audit_by_site aqs_qa_pep_audit_by_state aqs_removeheader aqs_revisionhistory aqs_sampledata_by_box aqs_sampledata_by_cbsa aqs_sampledata_by_county aqs_sampledata_by_site aqs_sampledata_by_state aqs_sign_up aqs_sites_by_county aqs_states aqs_transactionsample_by_county aqs_transactionsample_by_site aqs_transactionsample_by_state aqs_transactionsample_by_MA
RAQSAPI functions are named according to the service and filter variables that are made available by the AQS Data Mart API. 6
These are all the available variables that can be used with functions exported from the RAQSAPI library, listed alphabetically.
AQSobject: an R object that is an AQS_DATAMART_APIv2 S3 object.
bdate: an R date object which represents the begin date of the data selection. Only data on or after this date will be returned.
cbdate (optional): an R date object which represents the “beginning date of last change” that indicates when the data was last updated. cbdate is used to filter data based on the change date. Only data that changed on or after this date will be returned. This is an optional variable which defaults to NA.
cedate (optional): an R date object which represents the “end date of last change” that indicates when the data was last updated. cedate is used to filter data based on the change date. Only data that changed on or before this date will be returned. This is an optional variable which defaults to NA.
countycode: an R character object which represents the 3 digit county FIPS code for the county being requested (with leading zero(s)). Refer to aqs_counties_by_state for the list of available county codes for each state.
edate: an R date object which represents the end date of the data selection. Only data on or before this date will be returned.
email: an R character object which represents the email account that will be used to register with the AQS API or change an existing user’s key. A verification email will be sent to the account specified.
key: the key used in conjunction with the username given to connect to AQS Data Mart.
MA_code: an R character object which represents the 4 digit AQS Monitoring Agency code (with leading zeroes).
maxlat: an R character object which represents the maximum latitude of a geographic box. Decimal latitude with north being positive. Only data south of this latitude will be returned.
maxlon: an R character object which represents the maximum longitude of a geographic box. Decimal longitude with east being positive. Only data west of this longitude will be returned. Note that -80 is less than -70.
minlat: an R character object which represents the minimum latitude of a geographic box. Decimal latitude with north being positive. Only data north of this latitude will be returned.
minlon: an R character object which represents the minimum longitude of a geographic box. Decimal longitude with east being positive. Only data east of this longitude will be returned.
parameter: an R character list or single character object which represents the parameter code of the air pollutant related to the data being requested.
return_header: If FALSE (default), only the data requested is returned. If TRUE, an AQS_DATAMART_APIv2 object is returned, which is a two item list that contains, in addition to the data requested, header information returned from the API server that is mostly used for debugging purposes.
sitenum: an R character object which represents the 4 digit site number (with leading zeros) within the county and state being requested.
stateFIPS: an R character object which represents the 2 digit state FIPS code (with leading zero) for the state being requested.
pqao_code: an R character object which represents the 4 digit AQS Primary Quality Assurance Organization code (with leading zeroes).
username: an R character object which represents the email account that will be used to connect to the AQS API.
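The variable descriptions above translate into R values roughly as follows. This is only a sketch using base R; the specific codes, dates and coordinates are illustrative values chosen for the example, and in_bounding_box is a hypothetical helper that is not part of RAQSAPI.

```r
# Codes stored as numbers lose their leading zeros, so format them as
# character strings of the expected width.
stateFIPS  <- sprintf("%02d", 1L)    # "01"   (2 digit state FIPS code)
countycode <- sprintf("%03d", 73L)   # "073"  (3 digit county FIPS code)
sitenum    <- sprintf("%04d", 23L)   # "0023" (4 digit site number)

# bdate and edate are R Date objects.
bdate <- as.Date("20170101", format = "%Y%m%d")
edate <- as.Date("20171231", format = "%Y%m%d")

# parameter may hold a single parameter code or several codes.
parameter <- c("44201", "88101")

# Bounding box limits are character objects in decimal degrees, with north
# and east positive.
minlat <- "33.3"; maxlat <- "33.6"
minlon <- "-87.0"; maxlon <- "-86.7"

# Hypothetical helper (not part of RAQSAPI): TRUE when a point lies inside the
# box described by the four limits.
in_bounding_box <- function(lat, lon, minlat, maxlat, minlon, maxlon) {
  box <- as.numeric(c(minlat, maxlat, minlon, maxlon))
  lat >= box[1] & lat <= box[2] & lon >= box[3] & lon <= box[4]
}

in_bounding_box(33.5, -86.8, minlat, maxlat, minlon, maxlon)  # TRUE
in_bounding_box(33.5, -88.0, minlat, maxlat, minlon, maxlon)  # FALSE, too far west
```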
The functions included in this family of functions are: aqs_credentials aqs_sign_up
These functions are used to sign up with Data Mart and to store credential information to use with RAQSAPI. The RAQSAPI::aqs_sign_up function takes one parameter: email.
The RAQSAPI::aqs_credentials function takes two parameters: username and key.
These functions return Data Mart metadata.
The RAQSAPI::aqs_isavailable function takes no parameters and returns a table which details the status of the AQS API.
The RAQSAPI::aqs_fields_by_service function takes one parameter, service, which is an R character object that represents one of the services provided by the AQS API. For a list of available services see Air Quality System (AQS) API - Services Overview.
The RAQSAPI::aqs_knownissues function takes no parameters and returns a table of any known issues with system functionality or the data. These are usually issues that have been identified internally and will require some time to correct in Data Mart or the API. This function implements a direct API call to Data Mart and returns data directly from the API. Issues returned via this function do not include any issues from the RAQSAPI R package.
The RAQSAPI::aqs_revisionhistory function is used to query Data Mart for the change history to the API.
aqs_cbsas aqs_classes aqs_counties_by_state aqs_mas aqs_pqaos aqs_sites_by_county aqs_states
List functions return options or groupings that can be used in conjunction with other API calls. By default each function in this category returns results as a tibble. If the return_header parameter is set to TRUE, an AQS_DATAMART_APIv2 object is returned instead.
RAQSAPI::aqs_cbsas returns a table of all available Core Based Statistical Areas (cbsas) and their respective cbsa codes.
RAQSAPI::aqs_states takes no arguments and returns a table of the available states and their respective state FIPS codes.
RAQSAPI::aqs_classes takes no arguments and returns a table of parameter classes (groups of parameters, e.g. “criteria” or “all”). The information from this function can be used as input to other API calls.
RAQSAPI::aqs_counties_by_state takes one parameter, stateFIPS, which is a two digit state FIPS code for the state being requested, represented as an R character object, and returns a table of counties and their respective FIPS codes for the state requested. Use RAQSAPI::aqs_states to receive a table of valid state FIPS codes.
RAQSAPI::aqs_sites_by_county takes two parameters: stateFIPS, which is a two digit state FIPS code for the state being requested, and countycode, which is a three digit county FIPS code for the county being requested. Both stateFIPS and countycode should be encoded as R character objects. This function returns a table of all air monitoring sites with the requested state and county FIPS code combination.
RAQSAPI::aqs_pqaos takes no parameters and returns an AQS_DATAMART_APIv2 S3 object containing a table of primary quality assurance organizations (pqaos).
RAQSAPI::aqs_mas takes no parameters and returns an AQS_DATAMART_APIv2 S3 object containing a table of monitoring agencies (MA).
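A typical use of the list functions is to look up valid codes and then pass them on to another call. The sketch below wraps aqs_sites_by_county in a hypothetical helper, sites_in_county, which is not part of RAQSAPI; it checks that both codes are character objects of the expected width before any request is sent. It assumes that RAQSAPI is installed and that aqs_credentials has already been called.

```r
# Hypothetical wrapper (not part of RAQSAPI) that validates the codes before
# querying Data Mart for the monitoring sites in one county.
sites_in_county <- function(stateFIPS, countycode) {
  stopifnot(
    is.character(stateFIPS), nchar(stateFIPS) == 2,
    is.character(countycode), nchar(countycode) == 3
  )
  # Sends the request; needs valid credentials and network access.
  RAQSAPI::aqs_sites_by_county(stateFIPS = stateFIPS, countycode = countycode)
}
```

For example, sites_in_county("01", "073") would request the sites for state "01" and county "073", while sites_in_county(1, 73) stops with an error before contacting the server. Valid codes can be looked up first with aqs_states and aqs_counties_by_state.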
| Information: AQS Data Mart API restricts the
maximum amount of monitoring data to one full year of data per
API call. These functions are able to return multiple years of data by
making repeated calls to the API. Each call to the Data Mart API will take
time to complete. The more years of data being requested the longer RAQSAPI
will take to return the results.
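To illustrate what repeated calls mean in practice, the sketch below splits a date range into pieces that each stay within one calendar year, which gives the number of API calls a request of that length would need. The function split_by_year is a hypothetical helper written for this example in base R; it is not RAQSAPI's internal implementation.

```r
# Hypothetical helper (not part of RAQSAPI): splits the range bdate..edate into
# one row per calendar year, each row holding the begin and end date of a
# single API request.
split_by_year <- function(bdate, edate) {
  stopifnot(inherits(bdate, "Date"), inherits(edate, "Date"), bdate <= edate)
  years <- seq(as.integer(format(bdate, "%Y")), as.integer(format(edate, "%Y")))
  starts <- as.Date(paste0(years, "-01-01"))
  ends <- as.Date(paste0(years, "-12-31"))
  starts[1] <- bdate            # the first piece starts at the requested date
  ends[length(ends)] <- edate   # the last piece ends at the requested date
  data.frame(bdate = starts, edate = ends)
}

pieces <- split_by_year(as.Date("2015-06-18"), as.Date("2017-03-01"))
nrow(pieces)  # 3, so three API calls would be needed
```

With the five second stall after each call, a request covering three calendar years would therefore spend at least fifteen seconds waiting in addition to the time the server needs to respond.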
These functions retrieve aggregated data from the Data Mart API and are grouped by how each function aggregates the data. There are 5 different families of related aggregate functions. These families are arranged by how the Data Mart API groups the returned data: _by_site, _by_county, _by_state, by latitude/longitude bounding box (_by_box) and by core based statistical area (_by_cbsa).
Monitors: Returns operational information about the samplers (monitors) used to collect the data. Includes identifying information, operational dates, operating organizations, etc. Functions using this service contain *monitors_by_* in the function name.
Sample Data: Returns sample data, the finest-grained data reported to EPA. Usually hourly, sometimes 5-minute, 12-hour, etc. This service is available in several geographic selections: site, county, state, cbsa (core based statistical area, a grouping of counties), or by latitude/longitude bounding box. Functions using this service contain *sampledata_by_* in the function name. All Sample Data functions accept two additional, optional parameters, cbdate and cedate.
Daily Summary Data: Returns data summarized at the daily level. All daily summaries are calculated on a midnight to midnight basis in local time. Variables returned include date, mean value, maximum value, etc. Functions using this service contain *dailysummary_by_* in the function name. All Daily Summary Data functions accept two additional parameters, cbdate and cedate.
Annual Summary Data: Returns data summarized at the yearly level. Variables include mean value, maxima, percentiles, etc. Functions using this service contain *annualsummary_by_* in the function name. All Annual Summary Data functions accept two additional parameters, cbdate and cedate.
Quality Assurance - Blanks Data: Quality assurance data - blanks samples. Blanks are unexposed sample collection devices (e.g., filters) that are transported with the exposed sample devices to assess if contamination is occurring during the transport or handling of the samples. Functions using this service contain *qa_blanks_by_* in the function name.
Quality Assurance - Collocated Assessments: Quality assurance data - collocated assessments. Collocated assessments are pairs of samples collected by different samplers at the same time and place. (These are “operational” samplers, assessments with independently calibrated samplers are called “audits”.). Functions using this service contain *qa_collocated_assessments_by_* in the function name.
Quality Assurance - Flow Rate Verifications: Quality assurance data - flow rate verifications. Several times per year, each PM monitor must have its (fixed) flow rate verified by an operator taking a measurement of the flow rate. Functions using this service contain *qa_flowrateverification_by_* in the function name.
Quality Assurance - Flow Rate Audits: Quality assurance data - flow rate audits. At least twice a year, each PM monitor must have its flow rate measurement audited by an expert using a different method than is used for flow rate verifications. Functions using this service contain *qa_flowrateaudit_by_* in the function name.
Quality Assurance - One Point Quality Control Raw Data: Quality assurance data - one point quality control check raw data. At least every two weeks, certain gaseous monitors must be challenged with a known concentration to determine monitor performance. Functions using this service contain *qa_one_point_qc_by_* in the function name.
Quality Assurance - pep Audits: Quality assurance data - performance evaluation program (pep) audits. pep audits are independent assessments used to estimate total measurement system bias with a primary quality assurance organization. Functions using this service contain *qa_pep_audit_by_* in the function name.
Transaction Sample - AQS Submission data in transaction Format (RD): Transaction sample data - The raw transaction sample data uploaded to AQS by the agency responsible for data submissions in RD format. Functions using this service contain *transactionsample_by_* in the function name. Transaction sample data is only available aggregated by site, county, state or monitoring agency.
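As a sketch of how a service, a geographic selection and the optional change date filter combine, the hypothetical wrapper below (not part of RAQSAPI) requests sample data for one site and one parameter, restricted to records that changed on or after a given date. The parameter code, dates and site codes are illustrative values, and the call assumes that RAQSAPI is installed and that aqs_credentials has already been called.

```r
# Hypothetical wrapper (not part of RAQSAPI): sample data for one site and one
# parameter, limited to records changed on or after `changed_since`.
samples_changed_since <- function(changed_since) {
  stopifnot(inherits(changed_since, "Date"))
  # Sends the request; needs valid credentials and network access.
  RAQSAPI::aqs_sampledata_by_site(
    parameter = "44201",
    bdate = as.Date("2017-06-18"),
    edate = as.Date("2017-12-31"),
    stateFIPS = "37",
    countycode = "183",
    sitenum = "0014",
    cbdate = changed_since
  )
}
```

Leaving cbdate and cedate at their defaults returns the data regardless of when it was last changed.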
aqs_annualsummary_by_site aqs_dailysummary_by_site aqs_monitors_by_site aqs_qa_blanks_by_site aqs_qa_collocated_assessments_by_site aqs_qa_flowrateaudit_by_site aqs_qa_flowrateverification_by_site aqs_qa_one_point_qc_by_site aqs_qa_pep_audit_by_site aqs_sampledata_by_site aqs_transactionsample_by_site
Functions in this family aggregate data at the site level. All *_by_site functions accept the following variables:
aqs_annualsummary_by_county aqs_dailysummary_by_county aqs_monitors_by_county aqs_qa_blanks_by_county aqs_qa_collocated_assessments_by_county aqs_qa_flowrateaudit_by_county aqs_qa_flowrateverification_by_county aqs_qa_one_point_qc_by_county aqs_qa_pep_audit_by_county aqs_sampledata_by_county aqs_sites_by_county aqs_transactionsample_by_county
Functions in this family aggregate data at the county level. All functions accept the following variables:
aqs_qa_blanks_by_MA aqs_qa_collocated_assessments_by_MA aqs_qa_flowrateaudit_by_MA aqs_qa_flowrateverification_by_MA aqs_qa_one_point_qc_by_MA aqs_qa_pep_audit_by_MA aqs_transactionsample_by_MA
Functions in this family aggregate data at the Monitoring Agency (MA) level. All functions accept the following variables:
aqs_annualsummary_by_cbsa aqs_dailysummary_by_cbsa aqs_monitors_by_cbsa aqs_sampledata_by_cbsa
Functions in this family aggregate data at the Core Based Statistical Area (cbsa) level. All functions accept the following variables:
aqs_qa_blanks_by_pqao aqs_qa_collocated_assessments_by_pqao aqs_qa_flowrateaudit_by_pqao aqs_qa_flowrateverification_by_pqao aqs_qa_one_point_qc_by_pqao aqs_qa_pep_audit_by_pqao
Functions in this family aggregate data at the Primary Quality Assurance Organization (pqao) level. All functions accept the following variables:
aqs_annualsummary_by_box aqs_dailysummary_by_box aqs_monitors_by_box aqs_sampledata_by_box
Functions in this family aggregate data by a latitude/longitude bounding box (_by_box). All functions accept the following variables:
aqs_annualsummary_by_state aqs_counties_by_state aqs_dailysummary_by_state aqs_monitors_by_state aqs_qa_blanks_by_state aqs_qa_collocated_assessments_by_state aqs_qa_flowrateaudit_by_state aqs_qa_flowrateverification_by_state aqs_qa_one_point_qc_by_state aqs_qa_pep_audit_by_state aqs_sampledata_by_state aqs_transactionsample_by_state
Functions in this family aggregate data at the state level. All functions accept the following variables:
These are miscellaneous functions exported by RAQSAPI.
RAQSAPI::aqs_removeheader is the function that the RAQSAPI library uses internally to coerce an AQS_DATAMART_APIv2 S3 object into a tibble. This is useful if the user has saved the output from another RAQSAPI function with return_header = TRUE set but later decides that they want just a simple tibble object. This function takes only one variable: AQSobject.
For functions that have the return_header = TRUE option set, the returned object is an AQS_DATAMART_APIv2 object. This is a 2 item list where the first item is a tibble with the label $Header and the second is a tibble with the label $Data. sampledata functions are limited by the API to one calendar year of data per API call, so if the user requests multiple years of data the sampledata call will return multiple AQS_DATAMART_APIv2 objects, one for each call to the API. The returned result is a list of AQS_DATAMART_APIv2 objects. In R, to access the data in each item in the list the user will need to use the “double bracket operator” ([[ ]]), not the single bracket operator ([ ]).
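The difference between the two bracket operators can be seen on a small mock object. The sketch below does not call the API: mock_result only imitates the shape of a multi-year result, plain data frames stand in for tibbles, and the column names and values are invented for illustration.

```r
# Mock of a multi-year result with return_header = TRUE: a list holding one
# two-item object (Header and Data) per API call.
mock_result <- list(
  list(Header = data.frame(status = "Success", rows = 2L),
       Data = data.frame(date_local = c("2017-06-18", "2017-06-19"),
                         sample_measurement = c(0.041, 0.038))),
  list(Header = data.frame(status = "Success", rows = 1L),
       Data = data.frame(date_local = "2018-01-01",
                         sample_measurement = 0.025))
)

# Double brackets extract the object itself, so $Data is available.
first_year <- mock_result[[1]]$Data

# Single brackets return a list of length 1 that still wraps the object.
still_a_list <- mock_result[1]

# Combine the Data portion of every call into one data frame.
all_data <- do.call(rbind, lapply(mock_result, function(x) x$Data))
nrow(all_data)  # 3
```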
3. RAQSAPI’s rate limit does not guarantee that the user will not go over the rate limit and does not guarantee that API calls do not overload the AQS Data Mart system; each user should monitor their requests independently.
4. Use “?aqs_sign_up” after the RAQSAPI library has been loaded to see the full usage description of the aqs_sign_up function.
5. Use “?aqs_credentials” after the RAQSAPI library has been loaded to see the full usage description of the aqs_credentials function.
6. See https://aqs.epa.gov/aqsweb/documents/data_api.html for full details of the Data Mart API.