Common Errors
This is a living document intended as a kind of "decoder ring" for data providers debugging common validation errors. It is light on detail at this time, but it will continue to grow as we come across additional use cases.
If you would like to contribute to this page, please create a pull request or issue on our GitHub repository or email the PDS Operator with input. Thanks in advance for helping your fellow users.
White spaces are required error
Execution of the Validate Tool may result in the following message appearing in the log:
FAIL: file:/Users/.../hi0173794441_9080000_001_r.xml FATAL_ERROR line 1, 55: White spaces are required between publicId and systemId.
The message above is generated by the underlying Xerces library that is utilized by the Validate Tool for XML Schema validation. Although not very intuitive, the message normally indicates that the XML Schema for the default namespace of the target label is missing. In the example above the default namespace was "http://pds.nasa.gov/pds4/pds/v03" but the XML Schema file describing that namespace (PDS4_PDS_0300a.xsd) was not provided to the tool at runtime.
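To find out which namespace a label declares, and therefore which XML Schema file must be supplied to the tool, you can inspect the label's root element. The following is a minimal Python sketch, not part of the Validate Tool. The function name is hypothetical, and it assumes the root element is unprefixed, so that its namespace is the label's default namespace.

```python
import xml.etree.ElementTree as ET

# Standard XML Schema instance namespace, used by the xsi:schemaLocation attribute.
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def label_namespaces(label_xml: str) -> dict:
    """Report the root element's namespace and any schema locations it declares.

    Hypothetical helper: assumes the root element is unprefixed, so its
    namespace is the default namespace of the label.
    """
    root = ET.fromstring(label_xml)
    # ElementTree stores qualified tags as "{namespace}localname".
    namespace = root.tag[1:].split("}", 1)[0] if root.tag.startswith("{") else ""
    # xsi:schemaLocation holds whitespace-separated "namespace location" pairs.
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    locations = dict(zip(tokens[0::2], tokens[1::2]))
    return {"root_namespace": namespace, "schema_locations": locations}
```

If the reported namespace is "http://pds.nasa.gov/pds4/pds/v03", as in the example above, the schema describing that namespace (PDS4_PDS_0300a.xsd) is the one the tool needs at runtime.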
java.lang.OutOfMemoryError
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
Running the tool against a large bundle may cause an OutOfMemoryError exception to appear on standard error, similar to the following:
Sep 19, 2017 12:02:39 PM gov.nasa.pds.tools.label.LocationValidator validate
INFO: Using validation style 'PDS4 Directory' for location file:/home/atmos7/anonymous/PDS/data/PDS4/MAVEN/iuvs_calibrated_bundle/
Sep 19, 2017 12:02:39 PM gov.nasa.pds.tools.validate.task.ValidationTask execute
INFO: Starting validation task for location 'file:/home/atmos7/anonymous/PDS/data/PDS4/MAVEN/iuvs_calibrated_bundle/'
Sep 22, 2017 7:07:31 AM gov.nasa.pds.tools.validate.task.ValidationTask execute
INFO: Validation complete for location 'file:/home/atmos7/anonymous/PDS/data/PDS4/MAVEN/iuvs_calibrated_bundle/'
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
When this OutOfMemoryError exception is thrown, no report is generated. This happens because the tool caches the validation results in memory until the end of the validation run. To resolve this issue, increase the JVM heap space allocation. The recommended heap space settings are -Xms4096m -Xmx8192m. The following describes how to update these settings in the tool for each target platform.
For Unix-Based Environments
Update the JVM heap space allocation settings in the validate shell script to the following:
"${JAVA_HOME}"/bin/java -Xms4096m -Xmx8192m -jar ${VALIDATE_JAR} "$@"
For Windows Environments
Update the JVM heap space allocation settings in the validate.bat batch file to the following:
"%JAVA_HOME%"\bin\java -Xms4096m -Xmx8192m -jar "%VALIDATE_JAR%" %*
Still not working?
See the Operate documentation for ways to improve performance, including making sure your execution is sufficiently batched.
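If the error persists after editing the script, it can help to confirm that the JVM actually honors the heap option. The following Python sketch is not part of the Validate Tool and its function names are hypothetical. It assumes a HotSpot-based JVM, where -XX:+PrintFlagsFinal prints a MaxHeapSize line (in bytes) to standard output.

```python
import re
import subprocess

# Matches e.g. "size_t MaxHeapSize = 8589934592 {product} {command line}"
# and the older "uintx MaxHeapSize := 8589934592 {product}" form.
_MAX_HEAP = re.compile(r"\bMaxHeapSize\s*:?=\s*(\d+)")


def parse_max_heap_bytes(flags_output: str):
    """Return the MaxHeapSize value in bytes, or None if it is not present."""
    match = _MAX_HEAP.search(flags_output)
    return int(match.group(1)) if match else None


def effective_max_heap_bytes(java: str = "java", heap_option: str = "-Xmx8192m"):
    """Ask the JVM which maximum heap size it would use with heap_option.

    Assumes a HotSpot JVM; other JVMs may not support -XX:+PrintFlagsFinal.
    """
    result = subprocess.run(
        [java, heap_option, "-XX:+PrintFlagsFinal", "-version"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_max_heap_bytes(result.stdout)
```

With -Xmx8192m, the expected value is 8192 * 1024 * 1024 bytes. A smaller value, or None, suggests the option is not reaching the JVM that runs the tool.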
No checksum found in the manifest errors
When performing Checksum Manifest file validation, having the wrong base path setting will result in multiple errors like the following:
FAIL: file:/home/pds4/dph_example_archive_VG2PLS/browse/Collection_browse.xml ERROR No checksum found in the manifest for 'file:/home/pds4/dph_example_archive_VG2PLS/browse/Collection_browse.xml'.
To resolve this issue, look at the Manifest File Base Path parameter at the top of the Validate Tool report and check that it correctly resolves the relative file references (if present) in the Checksum Manifest file. If the base path is incorrect, specify the correct one on the command line using the -B, --base-path option.
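To see how a base path changes which files the manifest appears to cover, the resolution can be reproduced outside the tool. The following Python sketch is illustrative only and does not reflect how the Validate Tool is implemented. The function names are hypothetical, and it assumes each manifest line has the form "checksum, whitespace, file reference", as produced by md5sum-style tools.

```python
import posixpath


def resolve_manifest(manifest_text: str, base_path: str) -> dict:
    """Map each manifest entry, resolved against base_path, to its checksum.

    Assumes md5sum-style lines: "<checksum>  <file reference>".
    """
    resolved = {}
    for line in manifest_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        checksum, reference = parts
        # md5sum marks binary-mode entries with a leading "*".
        reference = reference.strip().lstrip("*")
        full_path = posixpath.normpath(posixpath.join(base_path, reference))
        resolved[full_path] = checksum.lower()
    return resolved


def missing_from_manifest(targets, manifest_text: str, base_path: str) -> list:
    """Return the target paths that have no entry once the manifest is resolved."""
    known = resolve_manifest(manifest_text, base_path)
    return [t for t in targets if posixpath.normpath(t) not in known]
```

If every target comes back as missing, the base path is the likely culprit. If only a few do, those files are probably absent from the manifest itself.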
error.table.fields_mismatch
Debugging
The error indicates that the fields found in the data do not match the table defined in the label. Here are some things you can check to find out where it is failing:
- Table offset value - if the offset is incorrect, the validate reader will start in the wrong place in the data product, causing the fields to not line up properly. See this issue on GitHub as an example.
- Table field_delimiter - make sure this value is as expected
- Record fields value - make sure this value matches the number of fields in each record
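The checks above can be approximated for a delimited table by counting the fields in each record, starting at the table offset. The following is a minimal Python sketch and not the Validate Tool's own reader. The function name is hypothetical, and it assumes a text table whose delimiter is passed as the actual character (for example "," rather than the name used in the label).

```python
import csv
import io
from collections import Counter


def field_count_histogram(data: bytes, offset: int = 0, delimiter: str = ",") -> Counter:
    """Count how many records have each number of fields.

    Reading starts `offset` bytes into the file, mirroring the table offset
    given in the label. Blank lines are ignored.
    """
    text = data[offset:].decode("utf-8", errors="replace")
    # newline="" leaves record terminators untouched so csv can handle CR-LF.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    return Counter(len(row) for row in reader if row)
```

A table that matches its label should produce a single field count equal to the Record fields value. More than one count points to a wrong offset, a wrong delimiter, or malformed records.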
error.label.context_ref_mismatch
Description
New to Validate 1.16.0, this feature checks the name and type values associated with context products referenced in your label (e.g. Target_Identification, Investigation_Area, Observing_System_Component) against the values in the context products as registered in the PDS. To enable accurate search results, these values should match as closely as possible.
Fixing the problem
To fix the problem you have a few options:
- For data archived prior to v1.16.0 - you can reprocess, note the errors in your readme, or turn off context validation (--skip-context-validation)
- For new data archives - it is highly recommended that you update either the context product or the data labels
Invalid maximum heap size
Description
When attempting to run validate, you get the following error:
Invalid maximum heap size: -Xmx4096m
The specified size exceeds the maximum representable size.
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Fixing the problem
This error is most likely caused by running Validate with 32-bit Java instead of the required 64-bit Java. To check which version you are running, run java -version. The last line of the output should say 64-Bit Server VM, as in the following example. If it does not, you will need to install a 64-bit version of Java.
$ java -version
openjdk version "15.0.1" 2020-10-20
OpenJDK Runtime Environment (build 15.0.1+9)
OpenJDK 64-Bit Server VM (build 15.0.1+9, mixed mode, sharing)
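The same check can be scripted, for example before launching a batch of validation runs. The following Python sketch uses hypothetical function names and assumes the VM line is the last non-empty line of the java -version output, as in the example above.

```python
import subprocess


def is_64bit_jvm(version_output: str) -> bool:
    """Return True if the last non-empty line of `java -version` mentions 64-Bit."""
    lines = [line for line in version_output.splitlines() if line.strip()]
    return bool(lines) and "64-Bit" in lines[-1]


def java_version_output(java: str = "java") -> str:
    """Run `java -version` and return its text (java writes it to standard error)."""
    result = subprocess.run(
        [java, "-version"], capture_output=True, text=True, check=False
    )
    return result.stderr or result.stdout
```

Calling is_64bit_jvm(java_version_output()) returns False when a 32-bit Java is first on the PATH, which is the situation that produces the error above.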
PDF/A Issues
Description
When attempting to validate a Document product as PDF/A, you get an error like:
ERROR [error.pdf.file.not_pdfa_compliant]
Fixing the problem
This error indicates your PDF is not compliant with PDF/A version 1a or 1b. We are in the process of improving documentation for this, but in the meantime, you can look through the VeraPDF Rules documentation and try to track down the issues occurring in your PDF. Some common issues with documents being converted are:
- Embedded multimedia - PDF/A does not support embedded multimedia files
- Proprietary fonts - Fonts protected by copyright cannot be copied into PDF/A. Some non-free fonts may be allowed when working with their originating programs (such as using Microsoft fonts with Word or PowerPoint), but they will prevent a converter from converting the document to PDF/A later.
- 3D models - PDF/A does not support 3D models