Preparatory Steps

This section outlines the preparations that are necessary before running the PACTA for Banks analysis, including sourcing the required input data sets and software. If you plan to use the same source data to assess multiple loan books, these steps only need to be done once, except for the creation of individual raw loan book files for each of the loan books you want to analyze.

Required Input Data Sets

The PACTA for Banks analysis requires several input data sets, some from external sources and others prepared by the user. Certain data sets are optional.

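The required inputs are the asset-based company data (ABCD), the scenario data, and the loan book. The misclassified loans file and the manual sector classification are optional. Each input is described in more detail below.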

Asset-Based Company Data (ABCD)

  • Required input
  • External source
  • XLSX file

This dataset contains production profiles and emission intensities for companies in climate-critical economic sectors: automotive (light-duty vehicles), aviation, cement, coal mining, upstream oil & gas, power generation, and steel. It is typically sourced from third-party providers, but it can also be self-prepared or supplemented with additional entries.

The ABCD dataset must be a .xlsx file that contains the following columns:

  • company_id: <character>
  • name_company: <character>
  • lei: <character>
  • sector: <character>
  • technology: <character>
  • production_unit: <character>
  • year: <integer>
  • production: <numeric>
  • emission_factor: <numeric>
  • plant_location: <character>
  • is_ultimate_owner: <logical>
  • emission_factor_unit: <character>

PACTA is data-agnostic and supports any provider offering data in the correct format. One option is purchasing ABCD from Asset Impact.

For details on the required structure of this dataset, refer to the ABCD data dictionary.
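As a rough illustration, the column requirements listed above can be checked in R before running the analysis. The sketch below uses base R only and assumes the .xlsx file has already been read into a data frame. The helper name check_columns and the example row are made up for illustration and are not part of pacta.loanbook.

```r
# Expected ABCD columns and their R types, as listed in this section.
abcd_spec <- c(
  company_id = "character",
  name_company = "character",
  lei = "character",
  sector = "character",
  technology = "character",
  production_unit = "character",
  year = "integer",
  production = "numeric",
  emission_factor = "numeric",
  plant_location = "character",
  is_ultimate_owner = "logical",
  emission_factor_unit = "character"
)

# Hypothetical helper: returns a character vector of problems (empty if none).
check_columns <- function(data, spec) {
  problems <- character(0)
  missing_cols <- setdiff(names(spec), names(data))
  if (length(missing_cols) > 0) {
    problems <- c(problems, paste("missing column:", missing_cols))
  }
  for (col in intersect(names(spec), names(data))) {
    ok <- switch(
      spec[[col]],
      character = is.character(data[[col]]),
      integer = is.integer(data[[col]]),
      numeric = is.numeric(data[[col]]),
      logical = is.logical(data[[col]])
    )
    if (!ok) {
      problems <- c(problems, paste0("column ", col, " is not <", spec[[col]], ">"))
    }
  }
  problems
}

# Made-up example row; all values are invented for illustration only.
abcd_example <- data.frame(
  company_id = "1",
  name_company = "Example Power Co",
  lei = NA_character_,
  sector = "power",
  technology = "coalcap",
  production_unit = "MW",
  year = 2025L,
  production = 500,
  emission_factor = 0.9,
  plant_location = "DE",
  is_ultimate_owner = TRUE,
  emission_factor_unit = "example unit",
  stringsAsFactors = FALSE
)

# An empty result means every required column is present with the expected type.
abcd_problems <- check_columns(abcd_example, abcd_spec)
print(abcd_problems)
```

The same helper could be pointed at the other input files by swapping in the corresponding column specification.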

Scenario Data

  • Required input(s)
  • External source
  • CSV file(s)

For sectors with technology-level trajectories, the scenario dataset provides Sectoral Market Share Percentage (SMSP) and Target Market Share Ratio (TMSR) pathways based on the “market share approach”. This allocation rule assumes that every company in a sector adjusts its production in line with the overall climate transition scenario while its market share within the sector stays constant.

A more detailed explanation of the market share approach can be found here.
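The two pathway types can be illustrated with a small worked example. The sketch below assumes the formulation commonly given for the market share approach, in which a TMSR target scales the company's own initial production in a technology, while an SMSP target adds a share of the company's initial sector-wide production. All numbers and variable names are invented, and the exact calculation performed by the package should be checked against its documentation.

```r
# Made-up starting point for one company at the start year.
production_tech_t0 <- 100     # production in one technology
production_sector_t0 <- 1000  # production across the whole sector

# Made-up scenario values for a future year.
tmsr <- 0.8   # technology falls to 80% of its initial level
smsp <- 0.05  # technology gains 5% of the initial sector production

# TMSR target (assumed form): scale the technology's initial production.
target_tmsr <- production_tech_t0 * tmsr

# SMSP target (assumed form): add a share of the initial sector production.
target_smsp <- production_tech_t0 + production_sector_t0 * smsp

print(c(tmsr_target = target_tmsr, smsp_target = target_smsp))
```

Under these assumptions, the TMSR target depends only on the company's own technology-level production, whereas the SMSP target also depends on the size of the company's overall sector production.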

The Target Market Share Scenario dataset must be a CSV file and include the following columns:

  • scenario: <character>
  • sector: <character>
  • technology: <character>
  • region: <character>
  • year: <integer>
  • tmsr: <numeric>
  • smsp: <numeric>
  • scenario_source: <character>

For details on the required structure of this dataset, refer to the market share scenario data dictionary.
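When reading a scenario file into R, it can help to declare the column types explicitly so that they match the list above. The following is a minimal base R sketch. The file contents are made up and written to a temporary file only so that the example is self-contained.

```r
# Write a tiny made-up scenario file to a temporary location.
scenario_path <- tempfile(fileext = ".csv")
writeLines(
  c(
    "scenario,sector,technology,region,year,tmsr,smsp,scenario_source",
    "example_scenario,power,coalcap,global,2025,1,0,example_source",
    "example_scenario,power,coalcap,global,2030,0.8,-0.02,example_source"
  ),
  scenario_path
)

# Read the file with the column types listed in this section.
scenario <- read.csv(
  scenario_path,
  colClasses = c(
    scenario = "character",
    sector = "character",
    technology = "character",
    region = "character",
    year = "integer",
    tmsr = "numeric",
    smsp = "numeric",
    scenario_source = "character"
  )
)

# Inspect the resulting column types.
str(scenario)
```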

For sectors without technology-level pathways, PACTA applies the Sectoral Decarbonization Approach (SDA). This method requires all companies in a sector to align their physical emission intensity with a future scenario target (e.g., by 2050). Higher-emission companies must reduce their intensity more drastically than those already using cleaner technologies. However, SDA does not directly affect production volumes.

A more detailed explanation of the sectoral decarbonization approach can be found here.

The SDA scenario dataset must be a .csv file that contains the following columns:

  • scenario_source: <character>
  • scenario: <character>
  • sector: <character>
  • region: <character>
  • year: <numeric>
  • emission_factor_unit: <character>
  • emission_factor: <numeric>

For details on the required structure of this dataset, refer to the SDA scenario data dictionary.

The raw scenario values are based on models from external third-party organisations, such as the World Energy Outlook by the International Energy Agency (IEA), the Global Energy and Climate Outlook by the Joint Research Centre of the European Commission (JRC), or the One Earth Climate Model by the Institute for Sustainable Futures (ISF). The input data set for PACTA, however, must be prepared using additional steps, which are documented publicly on GitHub.

Prepared scenario files are available for download as CSVs in the Methodology and Supporting Documents section of the PACTA website. These files are typically updated annually based on the latest scenario publications. As a general rule, the publication year defines the initial year of the dataset, which is also commonly used as the analysis start year.

Loan Book

  • Required input
  • User-prepared
  • CSV file

The loan book is the financial dataset to be analyzed, detailing the loans provided to companies. Banks can source this data from their internal systems.

Each raw loan book must be prepared as a CSV file (one file per loan book) and include, at a minimum, the following columns:

  • id_loan: <character>
  • id_direct_loantaker: <character>
  • name_direct_loantaker: <character>
  • id_ultimate_parent: <character>
  • name_ultimate_parent: <character>
  • loan_size_outstanding: <numeric>
  • loan_size_outstanding_currency: <character>
  • loan_size_credit_limit: <numeric>
  • loan_size_credit_limit_currency: <character>
  • sector_classification_system: <character>
  • sector_classification_direct_loantaker: <character>
  • lei_direct_loantaker: <character>
  • isin_direct_loantaker: <character>

For details on the required structure of this dataset, refer to the loanbook data dictionary.

For detailed descriptions of how to prepare raw loan books, see the “Training Materials” section of the PACTA for Banks documentation. The “User Guide 2”, the “Data Dictionary”, and the “Loan Book Template” files can all be helpful in preparing your data.
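To make the expected layout concrete, the sketch below builds a two-row raw loan book with the minimum columns and writes it to a CSV file using base R. All identifiers, names, amounts, and classification codes are invented placeholders. The file name is an assumption, and the code format must match whatever the chosen classification system expects.

```r
# Made-up raw loan book with the minimum required columns.
loanbook <- data.frame(
  id_loan = c("L1", "L2"),
  id_direct_loantaker = c("C1", "C2"),
  name_direct_loantaker = c("Example Steel Ltd", "Example Power Co"),
  id_ultimate_parent = c("UP1", "UP2"),
  name_ultimate_parent = c("Example Holding", "Example Power Group"),
  loan_size_outstanding = c(250000, 100000),
  loan_size_outstanding_currency = "EUR",
  loan_size_credit_limit = c(500000, 150000),
  loan_size_credit_limit_currency = "EUR",
  sector_classification_system = "NACE",
  sector_classification_direct_loantaker = c("C24.10", "D35.11"),  # illustrative codes
  lei_direct_loantaker = NA_character_,
  isin_direct_loantaker = NA_character_,
  stringsAsFactors = FALSE
)

# Write one CSV file per loan book; missing values are written as empty fields.
loanbook_path <- file.path(tempdir(), "raw_loanbook_example.csv")
write.csv(loanbook, loanbook_path, row.names = FALSE, na = "")
```

In practice the data frame would come from an export of the bank's internal systems rather than being typed by hand.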

Misclassified Loans (optional)

  • Optional input
  • User-prepared
  • CSV file

The misclassified loans CSV file should contain a single column, id_loan, and be structured as follows:

  • id_loan: <character>

Users can provide a list of loans that were misclassified in the raw loan book. The goal is to remove false positives, that is, loans classified within a PACTA sector even though manual research confirms that the company does not operate in that sector. Misclassification may result, for example, from data entry errors. Excluding these loans from sector calculations can help improve the accuracy of the match success rate assessment.
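Conceptually, the misclassified loans file acts as an exclusion list keyed on id_loan. The base R sketch below shows the idea on made-up data. It is not the package's own implementation, which may handle these loans differently.

```r
# Made-up loan book excerpt; the codes are illustrative placeholders.
loanbook <- data.frame(
  id_loan = c("L1", "L2", "L3"),
  sector_classification_direct_loantaker = c("C24.10", "D35.11", "D35.11"),
  stringsAsFactors = FALSE
)

# Made-up misclassified loans list with its single id_loan column.
misclassified_loans <- data.frame(id_loan = "L3", stringsAsFactors = FALSE)

# Flag loans that appear on the list and keep only the remaining ones.
is_misclassified <- loanbook$id_loan %in% misclassified_loans$id_loan
loanbook_clean <- loanbook[!is_misclassified, , drop = FALSE]

print(loanbook_clean$id_loan)
```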

Manual Sector Classification (optional)

  • Optional input
  • User-prepared
  • CSV file

The manual sector classification dataset must be prepared as a CSV file and contain the following columns:

  • sector: <character>
  • borderline: <logical>
  • code: <character>
  • code_system: <character>

If you cannot obtain sector classification codes from any of the classification systems featured in sector_classifications (currently GICS, ISIC, NACE, NAICS, PSIC, and SIC), you can instead provide a manually created sector classification file for matching the loan book to the ABCD. Any such file must follow the format of sector_classifications. Use the built-in sector classifications where possible, as mapping your own sector classification to the PACTA sectors can be complex and time-consuming.
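The sketch below illustrates how a manual classification file relates to the loan book: the code and code_system columns correspond to sector_classification_direct_loantaker and sector_classification_system. The internal codes and the "INTERNAL" system name are hypothetical, and the base R merge only demonstrates the lookup idea, not how the package performs the matching.

```r
# Hypothetical manual sector classification in the required four-column format.
manual_classification <- data.frame(
  sector = c("power", "steel"),
  borderline = c(FALSE, FALSE),
  code = c("INT-100", "INT-200"),
  code_system = "INTERNAL",
  stringsAsFactors = FALSE
)

# Made-up loans that use the same hypothetical classification system.
loans <- data.frame(
  id_loan = c("L1", "L2"),
  sector_classification_system = "INTERNAL",
  sector_classification_direct_loantaker = c("INT-200", "INT-100"),
  stringsAsFactors = FALSE
)

# Look up the PACTA sector for each loan via classification system and code.
mapped <- merge(
  loans,
  manual_classification,
  by.x = c("sector_classification_system", "sector_classification_direct_loantaker"),
  by.y = c("code_system", "code"),
  all.x = TRUE
)

print(mapped[, c("id_loan", "sector", "borderline")])
```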

Required Software

Using the pacta.loanbook package for the PACTA for Banks analysis requires the following software to be installed on your system:

R (version 4.1.0 or higher)

R is the programming language that the pacta.loanbook package is written in. You can download R from the Comprehensive R Archive Network (CRAN).
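If R is already installed, a quick way to confirm that it meets the minimum version is to compare the running version against 4.1.0. This is a small base R sketch, and the variable names are arbitrary.

```r
# Minimum R version stated in this section.
minimum_version <- "4.1.0"

# Compare the version of the running R session against the minimum.
r_is_recent_enough <- getRversion() >= minimum_version

if (r_is_recent_enough) {
  message("R ", as.character(getRversion()), " meets the minimum version.")
} else {
  message("R ", as.character(getRversion()), " is older than ", minimum_version, ".")
}
```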

RStudio (optional)

RStudio is an integrated development environment (IDE) for R developed by Posit. It is not strictly required to run the analysis, but it can be helpful for managing your project and running the analysis. RStudio is widely used in the R community and is probably the easiest way to interact with most R tools, such as the pacta.loanbook suite of packages. RStudio Desktop is open source and free of charge. You can download RStudio from the Posit RStudio website.

{pacta.loanbook} package

The pacta.loanbook R package is the main software tool that you will use to run the PACTA for Banks analysis.

You can install the pacta.loanbook R package from any CRAN mirror by running the following command in R:

install.packages("pacta.loanbook")

Alternatively, you can install the development version of the pacta.loanbook R package from GitHub with:

# install.packages("pak")
pak::pak("RMI-PACTA/pacta.loanbook")

We use the pak package as a simple tool to install packages from GitHub.
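Whichever installation route you choose, you can confirm afterwards that the package is available in your R library. The following base R sketch only checks for the package and reports its version. It does not install anything.

```r
# Check whether pacta.loanbook can be loaded from the current library paths.
pkg_available <- requireNamespace("pacta.loanbook", quietly = TRUE)

if (pkg_available) {
  message(
    "pacta.loanbook version: ",
    as.character(utils::packageVersion("pacta.loanbook"))
  )
} else {
  message("pacta.loanbook is not installed in the current library.")
}
```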

Connecting to GitHub from RStudio

Note that if you choose to install the pacta.loanbook R package from GitHub, you will need to have:

  1. registered a GitHub account,
  2. installed git locally,
  3. set up credentials so that RStudio can communicate with GitHub.

You can find more information on how to do this using the following resources:

  • Happy Git and GitHub for the useR is a great and comprehensive resource that takes you through the process of setting up git and GitHub with RStudio, including registering a GitHub account, installing git, and connecting RStudio to GitHub.
  • Additional information on managing your GitHub connection from within RStudio can be found in the usethis package documentation, for example on managing git credentials.

If you only plan to use GitHub to install this package or other packages as shown above, you do not need a deep understanding of git commands, so there is no need to be overwhelmed by the complexity of git.

Required R packages

The pacta.loanbook R package depends on a number of other R packages. These dependencies will be installed automatically when you install the pacta.loanbook R package. The required packages are:

cli, data.table, dplyr, ggplot2, ggrepel, glue, lifecycle, magrittr, purrr, r2dii.analysis, r2dii.data, r2dii.match, r2dii.plot, rlang, rstudioapi, scales, stringdist, stringi, stringr, tibble, tidyr, tidyselect, zoo

FAQ

How do I install the {pacta.loanbook} R package?

The most common ways to install R packages are via CRAN or GitHub. Public institutions often restrict the installation of packages from GitHub, so you may need to install the package from CRAN. In some cases, your institution may mirror CRAN in its internal application registry, so you may need to install the package from there. If you have any issues with the installation from the internal application registry, it is best to reach out to your IT department. If you cannot obtain the package in any of these ways, please reach out to the package maintainers directly to explore other options.

How do I install the required R packages?

In principle, all dependencies required to run the pacta.loanbook R package will be installed automatically when you install the package. However, if you encounter any issues with the installation of the required packages, you can install them manually in R by calling install.packages() with a character vector of the package names from the list above.
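One possible way to do this is sketched below in base R: the dependency list from this page is compared against the packages already installed, and only the missing ones are passed to install.packages(). The install call is left commented out so that the snippet can be run safely to inspect what is missing first.

```r
# Dependencies of pacta.loanbook as listed under "Required R packages".
required_pkgs <- c(
  "cli", "data.table", "dplyr", "ggplot2", "ggrepel", "glue", "lifecycle",
  "magrittr", "purrr", "r2dii.analysis", "r2dii.data", "r2dii.match",
  "r2dii.plot", "rlang", "rstudioapi", "scales", "stringdist", "stringi",
  "stringr", "tibble", "tidyr", "tidyselect", "zoo"
)

# Packages from the list that are not yet installed in the current library paths.
missing_pkgs <- setdiff(required_pkgs, rownames(installed.packages()))
print(missing_pkgs)

# Uncomment to install the missing packages from your configured CRAN mirror:
# install.packages(missing_pkgs)
```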

Checklist of Preparatory Steps

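Before running the PACTA for Banks analysis, make sure that you have completed the following preparatory steps:

  • Obtained the ABCD as an XLSX file
  • Obtained the scenario data (market share and SDA) as CSV files
  • Prepared each raw loan book as a CSV file
  • Optionally, prepared the misclassified loans and manual sector classification CSV files
  • Installed R (version 4.1.0 or higher) and, optionally, RStudio
  • Installed the pacta.loanbook R package and its dependencies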

