
Common Errors

This is a living document intended as a kind of "decoder ring" for data providers debugging common validation errors. It is light on detail at this time, but it will continue to grow as we come across additional use cases.

If you would like to contribute to this page, please create a pull request or issue on our GitHub repository or email the PDS Operator with input. Thanks in advance for helping your fellow users.

White spaces are required error

Execution of the Validate Tool may result in the following message appearing in the log:

FAIL: file:/Users/.../hi0173794441_9080000_001_r.xml
    FATAL_ERROR  line 1, 55: White spaces are required between publicId and systemId.
      

The message above is generated by the underlying Xerces library that the Validate Tool uses for XML Schema validation. Although not very intuitive, the message normally indicates that the XML Schema for the default namespace of the target label is missing. In the example above, the default namespace was "http://pds.nasa.gov/pds4/pds/v03", but the XML Schema file describing that namespace (PDS4_PDS_0300a.xsd) was not provided to the tool at runtime.
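To confirm which default namespace a label declares before re-running the tool, a small shell helper along the following lines may be useful. This is only a sketch: default_namespace is a hypothetical helper name, not part of the Validate Tool, and it assumes the label declares its default namespace with a double-quoted xmlns="..." attribute.

```shell
# Print the default namespace (the bare xmlns="..." attribute) declared in a label.
# Hypothetical helper; assumes the attribute value is double-quoted.
# Usage: default_namespace LABEL_FILE
default_namespace() {
  # xmlns:prefix="..." declarations are not matched, only the bare xmlns="...".
  grep -o 'xmlns="[^"]*"' "$1" | head -n 1 | sed 's/.*xmlns="\([^"]*\)"/\1/'
}
```

The printed namespace tells you which XML Schema file needs to be supplied to the tool at runtime.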

java.lang.OutOfMemoryError

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded

Running the tool against a large bundle may result in an OutOfMemoryError exception appearing in the standard error similar to the following:

Sep 19, 2017 12:02:39 PM gov.nasa.pds.tools.label.LocationValidator validate
INFO: Using validation style 'PDS4 Directory' for location file:/home/atmos7/anonymous/PDS/data/PDS4/MAVEN/iuvs_calibrated_bundle/
Sep 19, 2017 12:02:39 PM gov.nasa.pds.tools.validate.task.ValidationTask execute
INFO: Starting validation task for location 'file:/home/atmos7/anonymous/PDS/data/PDS4/MAVEN/iuvs_calibrated_bundle/'
Sep 22, 2017 7:07:31 AM gov.nasa.pds.tools.validate.task.ValidationTask execute
INFO: Validation complete for location 'file:/home/atmos7/anonymous/PDS/data/PDS4/MAVEN/iuvs_calibrated_bundle/'
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded      
      

When this OutOfMemoryError exception is thrown, no report is generated. The cause is that the tool caches the validation results in memory until the end of the validation run. To resolve this issue, increase the JVM heap space allocation. The recommended settings are -Xms4096m -Xmx8192m. The following details how to update these settings in the tool, depending on the target platform.

For Unix-Based Environments:

Update the JVM heap space allocation settings in the validate shell script to the following:

"${JAVA_HOME}"/bin/java -Xms4096m -Xmx8192m -jar ${VALIDATE_JAR} "$@"
      

For Windows Environments:

Update the JVM heap space allocation settings in the validate.bat batch file to the following:

"%JAVA_HOME%"\bin\java -Xms4096m -Xmx8192m -jar "%VALIDATE_JAR%" %*
      

No checksum found in the manifest errors

When performing Checksum Manifest file validation, having the wrong base path setting will result in multiple errors like the following:

FAIL: file:/home/pds4/dph_example_archive_VG2PLS/browse/Collection_browse.xml
    ERROR  No checksum found in the manifest for 'file:/home/pds4/dph_example_archive_VG2PLS/browse/Collection_browse.xml'.     
      

To resolve this issue, check that the base path setting correctly resolves the relative file references (if present) in the Checksum Manifest file. The base path in use is shown as the Manifest File Base Path parameter at the top of the Validate Tool report. If the base path is incorrect, specify the correct one on the command line using the -B (--base-path) option.
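Before re-running the tool, you can sanity-check a candidate base path yourself. The sketch below is not part of the Validate Tool: check_manifest_paths is a hypothetical helper, and it assumes each manifest line has the md5sum-style form "<checksum> <relative path>".

```shell
# List manifest entries that do not resolve to an existing file under BASE_PATH.
# Hypothetical helper; assumes md5sum-style lines: "<checksum> <relative path>".
# Usage: check_manifest_paths BASE_PATH MANIFEST_FILE
check_manifest_paths() {
  base=$1
  manifest=$2
  missing=0
  # The "|| [ -n ... ]" part also handles a final line without a trailing newline.
  while read -r checksum path || [ -n "$checksum" ]; do
    [ -n "$path" ] || continue            # skip blank or malformed lines
    if [ ! -e "$base/$path" ]; then
      echo "unresolved: $base/$path"
      missing=$((missing + 1))
    fi
  done < "$manifest"
  [ "$missing" -eq 0 ]                    # succeed only if every entry resolved
}
```

If every entry is reported as unresolved, the base path is most likely wrong. If only a few are, the manifest itself may be out of date.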

error.table.fields_mismatch

Debugging

The error indicates the fields do not match the table defined in the label. Here are some things you can check to find out where it is failing:

  • Table offset value - if the offset is incorrect, the validate reader will start in the wrong place in the data product, causing the fields to not line up properly. See this issue on GitHub as an example.
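  • Table field_delimiter - make sure this value matches the delimiter actually used in the data file
  • Record fields value - make sure the declared field count matches the fields actually present in each record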
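
To narrow down which records are at fault, a quick field count outside the tool can help. This is a rough sketch, not the check Validate itself performs: find_field_mismatches is a hypothetical helper, and it assumes a simple delimited table in which the delimiter never appears inside a quoted value.

```shell
# Print "<line number>: <field count>" for every record whose field count
# differs from the expected one.
# Hypothetical helper; assumes no quoted values that contain the delimiter.
# Usage: find_field_mismatches TABLE_FILE DELIMITER EXPECTED_FIELDS
find_field_mismatches() {
  awk -F "$2" -v expected="$3" 'NF != expected { print NR ": " NF }' "$1"
}
```

For example, find_field_mismatches table.csv , 12 lists the records in a hypothetical table.csv that do not have 12 comma-separated fields. If the table starts at a nonzero offset, any header lines before that offset will be reported as well.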

error.label.context_ref_mismatch

Description

New in Validate 1.16.0, this feature checks the name and type values associated with context products referenced in your label (e.g. Target_Identification, Investigation_Area, Observing_System_Component) against the values in the context products as registered in the PDS. To enable accurate search results, these values should match as closely as possible.

Fixing the problem

To fix the problem you have a few options:

  • For data archived prior to v1.16.0 - you can reprocess, note the errors in your readme, or turn off context validation (--skip-context-validation)
  • For new data archives - it is highly recommended that you update either the context product or the data labels

Invalid maximum heap size

Description

When attempting to run validate, you get the following error:

Invalid maximum heap size: -Xmx4096m
The specified size exceeds the maximum representable size.
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
      

Fixing the problem

This error is most likely caused by running Validate with 32-bit Java instead of the required 64-bit Java. To check which version you are running, run java -version. The last line of the output should say 64-Bit Server VM, as in the example below. If it does not, you will need to install a 64-bit version of Java.

$ java -version
openjdk version "15.0.1" 2020-10-20
OpenJDK Runtime Environment (build 15.0.1+9)
OpenJDK 64-Bit Server VM (build 15.0.1+9, mixed mode, sharing)
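
If you want to script this check, something like the following sketch could work. The helper name is_64bit_jvm is hypothetical, and the check simply looks for the "64-Bit" marker in the version text, as in the output above.

```shell
# Succeed if the given `java -version` output reports a 64-bit JVM.
# Hypothetical helper; it only looks for the "64-Bit" marker in the text.
# Usage: is_64bit_jvm "$(java -version 2>&1)"
# (java -version writes to standard error, hence the 2>&1)
is_64bit_jvm() {
  printf '%s\n' "$1" | grep -q '64-Bit'
}
```

A non-zero exit status suggests a 32-bit JVM is first on your PATH.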
      

PDF/A Issues

Description

When attempting to validate a Document product as PDF/A, you get an error like:

        ERROR  [error.pdf.file.not_pdfa_compliant]
      

Fixing the problem

This error indicates your PDF is not PDF/A version 1a or 1b compliant. We are in the process of improving documentation for this, but in the meantime, you can look through the VeraPDF Rules documentation and try to track down the issues occurring in your PDF. Some common issues with documents being converted are:

  • Embedded multimedia - PDF/A does not support embedded multimedia files
  • Proprietary fonts - Fonts protected by copyright cannot be copied into PDF/A. Some non-free fonts may be allowed when working with their originating programs (such as using Microsoft fonts with Word or PowerPoint), but they will prevent a converter from converting them to PDF/A later.
  • 3D models - PDF/A does not support 3D models

If you are unable to debug, please contact your PDS Node representative or the PDS Operator for assistance generating a compliant PDF/A.