
@pdftron/data-extraction
This package is meant to be used with @pdftron/pdfnet-node to support Apryse intelligent document processing (IDP) data extraction. For more information on usage, see the guide: https://docs.apryse.com/documentation/core/guides/intelligent-data-extraction/
For further reading, check out our blog post on the project: https://apryse.com/blog/introducing-automated-data-extraction-pdf-idp
This package depends on unmanaged add-on binaries, and the add-on binaries are not cross-platform. At the moment we have support for
Installation will fail if your OS, Node.js or Electron version is not supported.
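If installation fails, it can help to check which platform, architecture, and runtime versions your process actually reports. The following is a minimal sketch using only Node.js built-ins; `describeRuntime` is a hypothetical helper name, not part of this package, and the sketch does not know which combinations are supported.

```javascript
// Hypothetical helper: collect the runtime details that determine
// which native add-on binary would be needed.
function describeRuntime() {
  return {
    platform: process.platform, // e.g. 'linux', 'darwin', 'win32'
    arch: process.arch,         // e.g. 'x64', 'arm64'
    node: process.versions.node,
    // process.versions.electron is only defined when running under Electron
    electron: process.versions.electron || null,
  };
}

console.log(describeRuntime());
```

Compare the printed values against the supported platforms listed for this package before retrying the install.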
Add the @pdftron/data-extraction package as a dependency in your package.json.
In your @pdftron/pdfnet-node code, include the following line after initialization:
await PDFNet.addResourceSearchPath("./node_modules/@pdftron/data-extraction/lib")
Here is an example of data extraction being used with this line.
const fs = require('fs');
const { PDFNet } = require('@pdftron/pdfnet-node');

const licenseKey = "Insert license key here";
const inputFile = "Insert input file location here";

async function main() {
  // This is where we import data-extraction
  await PDFNet.addResourceSearchPath("./node_modules/@pdftron/data-extraction/lib");

  // Extract document structure as a JSON file
  console.log('Extract document structure as a JSON file');
  let outputFile = 'out/paragraphs_and_tables.json';
  await PDFNet.DataExtractionModule.extractData(inputFile, outputFile, PDFNet.DataExtractionModule.DataExtractionEngine.e_DocStructure);
  console.log('Result saved in ' + outputFile);

  ///////////////////////////////////////////////////////
  // Extract document structure as a JSON string
  console.log('Extract document structure as a JSON string');
  outputFile = 'out/tagged.json';
  const json = await PDFNet.DataExtractionModule.extractDataAsString(inputFile, PDFNet.DataExtractionModule.DataExtractionEngine.e_DocStructure);
  // fs.writeFileSync does not create missing directories, so out/ must already exist
  fs.writeFileSync(outputFile, json);
}

PDFNet.runWithCleanup(main, licenseKey).catch(function (error) {
  console.log('Error: ' + JSON.stringify(error));
}).then(function () { return PDFNet.shutdown(); });
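The example above uses relative paths for both the resource search path and the output files. Assuming those paths are resolved against the working directory of the process, the script would only work when started from the project root with an existing out/ folder. The sketch below shows one way to make both steps less fragile; `dataExtractionLibPath` and `ensureParentDir` are hypothetical helpers introduced here for illustration, not part of @pdftron/data-extraction or @pdftron/pdfnet-node.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: absolute path to the add-on's lib directory,
// built from a known project root instead of the current working directory.
function dataExtractionLibPath(projectRoot) {
  return path.join(projectRoot, 'node_modules', '@pdftron', 'data-extraction', 'lib');
}

// Hypothetical helper: create the parent directory of an output file
// if it is missing, then return the file path unchanged.
function ensureParentDir(filePath) {
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  return filePath;
}

// Possible usage inside main() from the example above:
//   await PDFNet.addResourceSearchPath(dataExtractionLibPath(__dirname));
//   const outputFile = ensureParentDir(path.join(__dirname, 'out', 'tagged.json'));
```

Passing `__dirname` assumes the script lives in the project root next to node_modules; adjust the root argument if your layout differs.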
A larger code sample can be found here
To get started, see the documentation at https://www.pdftron.com/documentation/nodejs/get-started/integration.
Please go to https://docs.apryse.com/documentation/core/info/license/ to obtain a demo or production license.
The Apryse SDK Data-Extraction Module.