How to read PDF file in Apache JMeter?

In this article, you will learn how to read a PDF file in Apache JMeter. JMeter can download and read a PDF file once you create a few custom requests.

How to download a PDF file in Apache JMeter?

First, you need to understand how the application is designed, i.e. how the PDFs are loaded onto the page. In most cases, PDFs load within iframe tags, and each request generates a unique PDF. You need to identify which request generates the PDF on the page. This can be done using Fiddler or the developer tools in your browser. I always use the developer tools in Google Chrome or Internet Explorer.

Open the developer console in your browser and repeat the business actions. In my case, I saw one POST request that sent the session ID, the auth ID, and a few other unique values. Its response was the PDF output. However, when I recorded the same flow in JMeter, this particular request was not recorded.

Hence, I created a custom sampler in Apache JMeter and sent the same data that I had captured in the developer tools. Finally, I got the output. By default, the PDF response looks as shown below.

PDF Output
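
The unreadable output is the raw binary body of the PDF. As a minimal sketch outside JMeter, the Java snippet below checks whether a response body looks like a PDF by its leading `%PDF-` marker and writes the bytes to disk unchanged. The class name, the method names, and the sample bytes are hypothetical and only illustrate the idea; they are not part of JMeter.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class PdfResponseCheck {

    // PDF files normally begin with the ASCII marker "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true when the response body begins with the PDF marker. */
    static boolean looksLikePdf(byte[] body) {
        if (body == null || body.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (body[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Writes the raw bytes unchanged, since the body is binary data rather than text. */
    static Path saveBody(byte[] body, Path target) throws IOException {
        return Files.write(target, body);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for the bytes returned by the custom request.
        byte[] body = "%PDF-1.4\n...binary content...".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("Looks like a PDF: " + looksLikePdf(body));
        Path saved = saveBody(body, Files.createTempFile("response", ".pdf"));
        System.out.println("Saved to " + saved);
    }
}
```

A check like this can help confirm that the custom request really returned a PDF rather than, say, an HTML error page.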

The raw response is not in a readable format. To download the whole PDF, add the following elements to your test plan.

  1. Regular Expression Extractor to extract the complete PDF response
    1. Use the regular expression (?s)(^.*) which matches the complete response, and save it to the variable pdfresponse
Regular Expression Extractor
  2. Save Responses to a file, which saves the complete response as a file
    1. Now configure Save Responses to a file as shown below, which will save the PDF to your JMETER_HOME\bin folder
Save Responses to a file
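
To illustrate why the `(?s)` flag matters in the expression above, here is a small sketch. It uses `java.util.regex` rather than the regex engine inside JMeter, and the sample response text and class name are hypothetical. With `(?s)`, the dot also matches line breaks, so the group captures the whole body; without it, the match stops at the first line break.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeResponseRegex {

    /** Returns capture group 1 of the first match, or null when nothing matches. */
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical multi-line response body, standing in for the PDF response.
        String response = "%PDF-1.4\nline two\nline three";

        // With (?s), "." also matches line terminators, so the group spans everything.
        String whole = firstGroup("(?s)(^.*)", response);

        // Without (?s), ".*" stops at the first line break.
        String firstLineOnly = firstGroup("(^.*)", response);

        System.out.println("Whole body captured: " + whole.equals(response));
        System.out.println("First line only: " + firstLineOnly);
    }
}
```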

How to read PDF file in Apache JMeter?

To read or parse the contents of the document, download the Tika jar file from the following URL: http://www.apache.org/dyn/closer.cgi/tika/tika-app-1.11.jar

Place the downloaded jar file in the JMETER_HOME\lib folder and restart JMeter. Tika supports the following formats. For more details, visit http://tika.apache.org/1.7/formats.html

  • HyperText Markup Language
  • XML and derived formats
  • Microsoft Office document formats
  • OpenDocument Format
  • iWorks document formats
  • Portable Document Format
  • Electronic Publication Format
  • Rich Text Format
  • Compression and packaging formats
  • Text formats
  • Feed and Syndication formats
  • Help formats
  • Audio formats
  • Image formats
  • Video formats
  • Java class files and archives
  • Source code
  • Mail formats
  • CAD formats
  • Font formats
  • Scientific formats
  • Executable programs and libraries
  • Crypto formats

The next step is to read the contents of the PDF, which is a little tricky. Follow the steps below carefully.

  1. Add a new HTTP Request sampler and configure it as shown below.
    1. Protocol: file
    2. Path: the complete path to the PDF
Custom HTTP Request
  2. Add a Regular Expression Extractor to retrieve your desired value from the PDF.
    1. When adding the Regular Expression Extractor, make sure that you select Body as a Document as the field to check, as shown below. If you select any other field, it will not extract or read contents from the PDF.
Body as a Document
  3. Add a View Results Tree listener for debugging purposes and select the Document format instead of Text, as shown below.
Document Format
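
Setting the protocol to file means the sampler reads the PDF through a `file:` URL instead of over HTTP. The sketch below shows what such a URL looks like for a local path and reads a file through it using only the JDK. The class name, the helper `toFileUrl`, and the temporary file are hypothetical and stand in for the PDF saved earlier.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class FileUrlSketch {

    /** Builds the file: URL string for a local path (absolute form of the given path). */
    static String toFileUrl(String localPath) {
        return Paths.get(localPath).toAbsolutePath().toUri().toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file standing in for the PDF saved earlier by the test plan.
        Path saved = Files.createTempFile("report", ".pdf");
        Files.write(saved, "%PDF-1.4 sample".getBytes(StandardCharsets.US_ASCII));

        String fileUrl = toFileUrl(saved.toString());
        System.out.println(fileUrl);

        // Reading through the URL mirrors a request that uses the file protocol.
        try (InputStream in = URI.create(fileUrl).toURL().openStream()) {
            System.out.println(in.readAllBytes().length + " bytes read");
        }
    }
}
```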

You can see the exact content of the PDF only if you select Document as the output format. Using the Regular Expression Extractor, you can retrieve values and compare them for verification purposes.
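
The extract-and-compare idea can be sketched in plain Java as follows. The document text, field labels, and expected value are made up for illustration, and the snippet uses `java.util.regex` rather than JMeter's own extractor, so treat it as a rough model of the verification step and not as JMeter's implementation.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ExtractAndVerify {

    /** Returns capture group 1 of the first match in the text, if there is one. */
    static Optional<String> extract(String regex, String documentText) {
        Matcher m = Pattern.compile(regex).matcher(documentText);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical text, standing in for what the Document view shows for a PDF.
        String documentText = "Invoice Number: INV-1042\nTotal Due: 250.00 USD\n";
        String expectedInvoice = "INV-1042";

        Optional<String> invoice = extract("Invoice Number:\\s*(\\S+)", documentText);

        // Compare the extracted value with the expected one, as an assertion would.
        boolean matches = invoice.map(expectedInvoice::equals).orElse(false);
        System.out.println("Invoice number verified: " + matches);
    }
}
```

In a test plan, the same comparison would typically be done with an assertion on the variable that the extractor populates.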

Conclusion

To read contents from PDF, Excel, RTF, and Office documents, you need to download the Tika jar file and place it under the JMETER_HOME\lib folder. By changing the output type to Document in View Results Tree, you can validate the output content.

9 thoughts on “How to read PDF file in Apache JMeter?”

  1. Hi ,

    I have created a JMeter script in which reports are downloaded in encrypted PDF format. I have followed the whole process described above, but it is not working.

    Can you suggest how to make it work?

  2. Hi sir, everything worked, but the PDF is not shown in readable format in View Results Tree, even after switching to the Document view. It is not shown in the Debug Sampler either. Please suggest how to see it in readable format.

