Data checking tool

2023-24 ILR data checking tool

The ILR data checking tool is a secure online tool to help further education and sixth form colleges return accurate Individualised Learner Record (ILR) data to the Education and Skills Funding Agency (ESFA).

Providers upload their data to the tool, which produces outputs that allow them to identify potential errors in their data. These errors can then be corrected before the data is formally submitted to the ESFA. Providers often have to use the data checking tool several times before all identified errors are corrected.

For the 2023-24 academic year, providers registered with the Office for Students (OfS) that submit ILR data (excluding providers that already submit data on the HESA collection) have an extra return on the Submit Learner Data portal. In this return, providers submit data on the proportion of each higher education course taught in each Learn Direct Classification System (LDCS) code.

This data was previously captured on the ILR at aim level; this collection instead concentrates on course-level information. The subject of study of higher education students is critical to various areas of the OfS’s work, including monitoring compliance against the regulatory conditions, calculation of funding and general analysis.

This year we require providers to submit a zip file containing two files to the Data Checking Tool. One will be the ILR XML file, and the other will be a CSV file containing the proportion taught in each LDCS code.

We do not restrict providers’ use of the data checking tool, but the tool may be slower when there is high demand.

We expect that providers will use the tool for data quality checking, planning purposes, and to increase their understanding of our uses of ILR data.

Providers should allow adequate time within the year to take full advantage of the data checking tool.

2023-24 ILR data

| Output | Release date |
|---|---|
| 2023-24 Student numbers data summary | 30 May 2024 |
| 2023-24 ILR quality control data summary | 30 May 2024 |
| 2023-24 learner characteristics data summary | 30 May 2024 |
| HESES23 comparison (not including funding comparison tables or data verification questions) | 30 May 2024 |
| 2023-24 Transparency attainment data summary | 29 August 2024 |
| Graduate Outcomes survey target list | 29 August 2024 |
| National Student Survey target list | 29 August 2024 |
| HESES23 comparison (full output, including funding comparison tables and data verification questions) | 29 August 2024 |

2023-24 Student numbers data summary

The student numbers data summary provides the full-time equivalent student numbers by level of study and relevance to TEF. It is formatted to mirror the output we will supply to providers post-collection, which will be used for various regulatory purposes, including setting registration fees, assessing applications for degree awarding powers and assessing applications for university title and university college title. Some minor changes are expected to the post-collection output to include processes such as deduplication.

2023-24 ILR quality control summary

The ILR quality control data summary identifies potential errors in records of higher education. Data quality checks include looking at data returned as missing or 'unknown', combinations of fields that appear less credible and the consistency of data reporting between years.

2023-24 learner characteristics data summary

The ILR learner characteristics data summary shows how learners may be categorised for OfS regulatory approaches. It covers different types of characteristics, including personal characteristics and characteristics of learners’ courses, such as subject of study and where they are taught.

HESES23 comparison

The HESES23 comparison compares 2023-24 ILR data with the provider’s HESES23 data returned to the OfS at the start of the 2023-24 academic year. The HESES23 comparison workbook will be revised later in the year to include a comparison between the provider’s latest 2024-25 OfS teaching grant and the grant modelled using the 2023-24 ILR data.

The algorithms used to generate the HESES23 comparison output are intended to be those used for the post-collection outputs next spring. However, we may make changes where we believe they will improve our algorithms.

Graduate Outcomes survey target list

The 2023-24 ILR data submitted to the data checking tool will be used to produce provisional target lists of students to be included in the 2023-24 Graduate Outcomes survey.

Colleges are encouraged to use these lists to check that the data submitted to ESFA will generate a valid target list.

Once we have received the 2023-24 ILR R14 data from the ESFA, we will use it to produce colleges’ target lists.

NSS target list

The 2023-24 ILR data submitted to the data checking tool will be used to produce a provisional target list of students to be included in the 2025 NSS. Providers are encouraged to use these lists to start preparing the contact details for these students.

The contact details will need to be returned to the agency running the survey in November 2024. Further details about the arrangements for the 2025 NSS will be published in due course.

In October 2024, the OfS will extract final target lists for the 2025 NSS from data submitted to the data checking tool. Colleges should therefore ensure that the most recent data submitted to the data checking tool by this point generates a complete target list.

Once we have received the 2023-24 ILR R14 data from the ESFA, we will use it to validate colleges’ target lists. OfS staff will routinely access NSS target lists generated by the data checking tool.

2023-24 Transparency attainment data summary

The Transparency attainment data summary shows the number of UK-domiciled undergraduate qualifiers in the 2023-24 ILR data by qualification classification achieved, mode of study, ethnicity, sex, and English Index of Multiple Deprivation 2019. It is formatted to mirror the output we will supply to providers post-collection to publish under condition F1 of the regulatory framework, with some minor changes expected to the post-collection output to include processes such as deduplication and year on year linking. The file from the data checking tool must not be published and must only be used to help providers verify and correct their data.

The data checking tool accepts the data format used for the Individualised Learner Record (ILR). This means that you can submit your data to the tool in exactly the same format as your actual data return. Details of the formats for particular academic years can be downloaded from the Education and Skills Funding Agency website.

This year we require providers to submit two files to the Data Checking Tool. One will be the ILR XML file, and the other will be a CSV file containing the proportion taught in each LDCS code. These two files should be uploaded together to the Data Checking Tool within a zip file.

So that you can use the tool as early as possible in the data collection process, the data checking tool will produce outputs even when supplied with invalid or incomplete data. However, these outputs may not be as reliable if you upload invalid data to the tool. To ensure your outputs are reliable, you should first correct any errors identified using the Funding Information System (FIS).

The ILR data checking tool can only receive zip files, and can only process them if the contents are in the correct format when unzipped.

Please ensure that you are uploading a single zip file that contains two files inside. One of the files should be the ILR XML file, and the other should be a CSV file containing the proportion taught in each LDCS code.
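As an illustration only, the zip file could be put together with a short script rather than by hand. This is a minimal Python sketch, not an OfS tool: the function name and file paths are hypothetical, and it assumes both files should sit at the top level of the archive.

```python
import zipfile
from pathlib import Path


def build_upload_zip(ilr_xml: Path, ldcs_csv: Path, zip_path: Path) -> Path:
    """Bundle one ILR XML file and one LDCS CSV file into a single zip."""
    if ilr_xml.suffix.lower() != ".xml":
        raise ValueError(f"expected an .xml file, got {ilr_xml.name}")
    if ldcs_csv.suffix.lower() != ".csv":
        raise ValueError(f"expected a .csv file, got {ldcs_csv.name}")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # arcname keeps only the file name, so neither file is nested in a folder
        # (assumption: the tool expects both files at the top level of the zip).
        zf.write(ilr_xml, arcname=ilr_xml.name)
        zf.write(ldcs_csv, arcname=ldcs_csv.name)
    return zip_path
```

For example, `build_upload_zip(Path("ilr.xml"), Path("ldcs.csv"), Path("upload.zip"))` would write `upload.zip` containing exactly those two files; the file names here are placeholders.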

ILR data should be uploaded to the tool via the OfS portal. You will need to gain access to the relevant area of the portal by contacting your provider’s OfS portal user administrator. If you are not sure who your user administrator is, please contact [email protected].

To upload data, navigate to the ‘Surveys’ heading of the portal, and click on the data checking tool link for the relevant year; this will take you to the OfS data checking tool main page for that year.

ILR files can be uploaded one at a time on the data checking tool main page. Selecting ‘Upload’ will take you to a page detailing how to upload a file to the portal. Once you have selected the file using the browse facility, click ‘Upload’, which will take you back to the main page.

After you have uploaded the relevant file, click ‘Submit’. This places the file(s) added in the previous step into the processing queue. The page will update automatically to show the current stage of processing. To cancel your submission while your files are in the queue, select ‘Cancel’.

Files uploaded to the portal can only be processed one at a time, so files are placed in a processing queue. Once your files are in the processing queue, you do not need to stay logged in to the portal; you are free to leave the page at any time and your data will still be processed.

Because files can only be processed one at a time, the total time taken to process files (i.e. the time that the file is in the queue plus the time taken to actually process the file) will be longer when there are more files in the queue.
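The arithmetic can be sketched in a few lines of Python. The function name and all durations below are hypothetical; actual processing times are not published here and will vary.

```python
def estimated_total_minutes(minutes_ahead, own_minutes):
    """Queue wait (processing time of every file ahead) plus your own file's processing time."""
    return sum(minutes_ahead) + own_minutes


# Three files ahead in the queue taking 10, 25 and 5 minutes, then a 15-minute file:
print(estimated_total_minutes([10, 25, 5], 15))  # prints 55
```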

Once the files have been processed, you can access the outputs by selecting ‘Results’.

If the outputs could not be generated, the reason may be given in the ‘History’ section. If no reason is given, it is likely that the outputs could not be generated because erroneous data was uploaded.

Please email us at [email protected] if you have issues in uploading your data.

How long does it take to process the file(s)?

Our systems only allow one submission to be processed at a time; therefore, if more than one provider submits data, the submissions will be placed in a queue. The total time taken to complete processing, including waiting time, will depend on the number of providers in the queue.

How can I gain access to the data checking tool area?

You will need to gain access to the relevant area of the portal by contacting your provider’s OfS portal user administrator. If you are not sure who your user administrator is, please contact [email protected].

What do I need to do if an error occurs while processing?

Ensure that:

  • the files you used are the correct ones

  • the files have passed ILR validation and are in the correct format

  • all data retains its leading zeros, by viewing the file in a text editor such as Notepad

  • you are uploading a single zip file which contains the ILR XML, as well as a CSV containing the proportion taught in each LDCS code. 
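Some of the checks above could be scripted before uploading. The following is a minimal Python sketch, not an OfS tool: the function names are hypothetical, it assumes the CSV is UTF-8 encoded, and it only tests that the XML is well formed, which is not the same as passing ILR validation.

```python
import csv
import io
import zipfile
import xml.etree.ElementTree as ET


def check_upload_zip(zip_path):
    """Return a list of problems found in the zip; an empty list means none were found."""
    problems = []
    with zipfile.ZipFile(zip_path) as zf:
        names = [n for n in zf.namelist() if not n.endswith("/")]  # ignore folder entries
        xml_names = [n for n in names if n.lower().endswith(".xml")]
        csv_names = [n for n in names if n.lower().endswith(".csv")]
        if len(names) != 2 or len(xml_names) != 1 or len(csv_names) != 1:
            problems.append(f"expected one .xml and one .csv, found: {names}")
            return problems
        try:
            ET.fromstring(zf.read(xml_names[0]))  # well-formedness only
        except ET.ParseError as exc:
            problems.append(f"XML is not well formed: {exc}")
        try:
            text = zf.read(csv_names[0]).decode("utf-8-sig")  # assumption: UTF-8
        except UnicodeDecodeError:
            problems.append("CSV could not be read as UTF-8")
        else:
            if not list(csv.reader(io.StringIO(text))):
                problems.append("CSV file is empty")
    return problems


def cells_with_leading_zeros(csv_text):
    """List all-digit cells that start with '0', read as text so nothing is stripped."""
    found = []
    for row in csv.reader(io.StringIO(csv_text)):
        found.extend(c for c in row if len(c) > 1 and c.isdigit() and c.startswith("0"))
    return found
```

Reading the CSV as text, as a text editor does, shows values exactly as stored. Spreadsheet software may convert such values to numbers and drop the leading zeros, so comparing the listed cells against your source data is one way to confirm the zeros are still present.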

If all the files are correct and you still have an error message, please email [email protected].

I have not (or have only recently) registered a course on the Learning Aims Search and as a result records are being excluded from one or more of the outputs. What should I do?

To work around this issue, the OfS can apply temporary overrides so that excluded records are included in the relevant output. If this applies to you, please email [email protected] explaining the situation.

Before the OfS can apply overrides, we require two things: confirmation (usually by email) from the ESFA that the course has been created, and a data filter that can be applied to your ILR submission to identify the affected records.

In future, providers should ensure courses are registered on the Learning Aims Search as early as possible to avoid this issue recurring.

Will data amendments for 2022-23 and earlier years be incorporated in these outputs?

Please see the technical documents for each individual output for more information on whether they include amendments for 2022-23 and earlier years.

What version of the LARS database will be used?

The version of the ESFA’s learning aims reference service database that we refer to will initially be taken in May 2024 and will be updated throughout the year. Although we are not using the total qualification time field from this database in any of the data checking tool outputs, we may use this field in our algorithms in future, including in calculations for registration and ongoing conditions of registration.

How will the data I supply to create the outputs be used by the OfS?

The data checking tool is a tool for providers. OfS staff will access data submitted to, or derived by, the data checking tool in the following circumstances:

  • to ask data verification questions and consider providers’ responses to them
  • to assist providers with queries about their outputs and data
  • to test the functionality of forthcoming outputs and the application of overrides
  • to produce early modelling
  • to verify future data returns
  • where a provider explicitly gives permission.

We do not intend to use data processed by the data checking tool to make decisions about individual providers, but may use responses to data verification questions in our assessments of data quality. We can provide further information on how we handle personal data on request.

I have feedback about the algorithms used in the data checking tool outputs

We would be grateful for feedback on the algorithms used to create the data checking tool outputs, particularly where they have changed from previous years. The algorithms used for each output will be made available on the OfS website once released. Feedback should be emailed to [email protected].

I have other questions not answered here

For any further information or guidance please email [email protected].

2023-24 ILR quality control summary

29 August 2024

We updated the 2023-24 ILR DCT quality control technical document and rebuild instructions to reference the updated algorithm for OFSQAIM published with the release of the 2023-24 ILR DCT classifying learning aims technical document. Before this update, the quality control output used the version of the algorithm published in HESES23 Course table information. We also updated the document to show that CAMPID had been added as a field in the quality control individualised file, to assist with the return of the additional LDCS collection CSV required this year.

Last updated 29 August 2024
29 August 2024
Published documentation as set out in the release schedule section.
30 May 2024
Added technical documentation for 2023-24.
25 April 2024
Updated with 2023-24 ILR data checking information
25 August 2023
Published: 2022-23 ILR - GO22 technical document; 2022-23 ILR - 2024 NSS target list technical document
18 August 2023
The 2022-23 ILR quality control technical document has been updated to reflect changes after two updates were made to the ILR quality control outputs.
16 August 2023
The 2022-23 ILR data checking tool HESES22 comparison technical document and rebuild document were updated; the transparency technical document was published.
07 July 2023
Added documentation for classifying learning aims.
22 June 2023
Minor corrections to two documentation files ('2022-23 ILR HESES22 comparison technical document' and '2022-23 ILR HESES22 comparison rebuild instructions').
13 June 2023
Three technical documents published: 2022-23 ILR HESES22 comparison technical document, 2022-23 ILR HESES22 comparison rebuild instructions, 2022-23 ILR learner characteristics technical document and rebuild instructions
02 May 2023
Added two technical documents for ILR
04 April 2023
Updated for 2022-23
24 August 2022
Added technical documents for: 2023 NSS target list; Graduate Outcomes 21
18 August 2022
The HESES21 comparison workbooks were revised to include funding modelled using 2021-22 ILR data.
04 August 2022
Documentation added for 2021-22
06 July 2022
Updated for 2021-22
22 September 2021
Update added
31 August 2021
The ‘2020-21 ILR - HESES20 comparison rebuild instructions’ were updated to include details on how the 2021-22 funding allocations can be modelled using the 2020-21 ILR data, where HESES20 data has previously been used.
25 August 2021
2022 NSS target list technical document added
24 August 2021
Algorithm changed for 2020-21 ILR quality control output - see 'Updates'
11 August 2021
Technical documentation published
09 August 2021
The release schedule was updated to make corrections to the release dates for the Graduate Outcomes and National Student Survey outputs; and to add confirmed release dates for HESES20 comparison.
04 August 2021
Added 2020-21 Transparency attainment data summary technical document and rebuild instructions
08 July 2021
Updated for 2020-21
12 October 2020
Update to the 'Quality control data summary: technical algorithms and rebuild instructions' technical document
01 September 2020
Technical documentation added for the following outputs: Transparency 2019-20 attainment summary, National Student Survey target list, Graduate Outcomes survey target list
14 August 2020
Quality control data summary technical document published
12 August 2020
Technical documentation added for the following outputs: HESES19 comparison, Higher education level apprenticeship data summary, 2019-20 student numbers data summary
23 July 2020
Update for the 2019-20 data checking tool
