Tag Archives: PDF

[Solved] macOS Monterey: printing a PDF on a bizhub C308 fails with “offending command: binary token, type=151”

Problem description:

As the title says, while printing a paper from a Mac today, the print job was aborted partway through and the printer reported the following error:

Error syntax error

OFFENDING COMMAND: binary token, type=151

STACK: 

At first I thought it was a driver problem, so I reinstalled the latest C308 driver, but the error was still there. Not every PDF failed to print; only some of them did.

After some investigation, it turned out to be a software problem: printing the PDF through the Mac's Preview application triggers the error above.

Solution:

Printing the same files with the standard Adobe Reader solved the problem completely.

Probable cause:

My guess is that Preview sends some commands that the printer does not support, so the job cannot be printed.

When Adobe Reader prints, it processes the PDF file itself and outputs print instructions that the printer can handle, so the job prints normally.

Analysis of the PDFBox error “Cannot read JPEG2000 image” when converting a PDF file to pictures, and an introduction to JPEG and JPEG 2000

I. Problem background

1. How to fix “Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed”

I’m building a Java project that extracts images from PDF files using PDFBox. Because I already use tika-app for other functions, I decided to use the PDFBox bundled in tika-app-1.20.jar.

I tried adding jai-imageio-core-1.3.1.jar explicitly, because tika-app is bundled with this jar. I also tried using the tika-app jar alone. Neither helped.

The line of code that throws the error:

PDXObject object=resources.getXObject(cosName);

Error stack trace:

org.apache.pdfbox.filter.MissingImageReaderException: Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed
    at org.apache.pdfbox.filter.Filter.findImageReader(Filter.java:163)
    at org.apache.pdfbox.filter.JPXFilter.readJPX(JPXFilter.java:115)
    at org.apache.pdfbox.filter.JPXFilter.decode(JPXFilter.java:64)
    at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:77)
    at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:175)
    at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:163)
    at org.apache.pdfbox.pdmodel.common.PDStream.createInputStream(PDStream.java:236)
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.<init>(PDImageXObject.java:140)
    at org.apache.pdfbox.pdmodel.graphics.PDXObject.createXObject(PDXObject.java:70)
    at org.apache.pdfbox.pdmodel.PDResources.getXObject(PDResources.java:426)

I’m sure jai-imageio-core is included in tika-app, yet it does not seem to be visible when I run the code.

Solution:

1. It turns out that an additional jar, jai-imageio-jpeg2000, is required to support JPEG 2000 (jp2k) images.

2. I found the fix by chance, but it is in fact mentioned in the PDFBox documentation. You need to add the following dependencies to pom.xml:

<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-core</artifactId>
    <version>1.4.0</version>
</dependency>

<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-jpeg2000</artifactId>
    <version>1.3.0</version>
</dependency>

<!-- Optional for you ; just to avoid the same error with JBIG2 images -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>jbig2-imageio</artifactId>
    <version>3.0.3</version>
</dependency>

The last dependency above (jbig2-imageio) is optional; it avoids the same kind of error with JBIG2 images.

If you are using gradle, add dependencies like this:

dependencies {
    implementation 'com.github.jai-imageio:jai-imageio-core:1.4.0'
    implementation 'com.github.jai-imageio:jai-imageio-jpeg2000:1.3.0'

    // Optional for you ; just to avoid the same error with JBIG2 images
    implementation 'org.apache.pdfbox:jbig2-imageio:3.0.3'
}

II. Project scenario

1. Problem scenario:

Our project has a bug in its document conversion feature, PDF to picture (using PDFBox 2.0.2): many picture elements in the original PDF file disappear after the pages are converted to pictures. That is not acceptable. Checking the log shows a large number of errors like this:

ERROR o.a.p.contentstream.PDFStreamEngine 890 - Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed

This means that the I/O tools needed to read JPEG 2000 images are missing, which is most likely the source of the problem.

2. Cause of the problem: the scanned images embedded in the PDF file are probably stored in JPEG 2000 format, so PDFBox needs JAI support during the conversion.

3. Solution: add the related dependencies

A Stack Overflow question, https://stackoverflow.com/questions/42169154/pdfbox1-8-12-convert-pdf-to-white-page-image , provided the hint: its last, easily overlooked answer suggests adding the dependencies.

4. Comparison:

Before adding the dependencies: pictures missing

After adding the dependencies: pictures displayed
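
Before changing dependencies, it can help to confirm that a PDF really contains JPEG 2000 (or JBIG2) image streams. The following is a minimal sketch, not part of the original fix: it simply counts the PDF filter names /JPXDecode and /JBIG2Decode in the raw bytes. The function name count_image_filters is made up for illustration, and the check will miss images whose dictionaries sit inside compressed object streams.

```python
# Rough check for JPEG 2000 / JBIG2 image streams in a PDF file.
# Assumption: the image dictionaries are stored uncompressed, so the
# filter names appear literally in the file bytes.
FILTER_NAMES = (b"/JPXDecode", b"/JBIG2Decode")


def count_image_filters(pdf_bytes):
    """Return how often each filter name occurs in the raw PDF bytes."""
    return {name.decode("ascii"): pdf_bytes.count(name) for name in FILTER_NAMES}


# Tiny hand-written fragment standing in for a real file's contents.
sample = b"<< /Type /XObject /Subtype /Image /Filter /JPXDecode /Length 0 >>"
counts = count_image_filters(sample)
print(counts)
```

For a real file, read it in binary mode (open(path, "rb").read()) and pass the bytes to the function. A non-zero /JPXDecode count suggests the JPEG 2000 dependency is needed.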

III. Introduction to JPEG and JPEG 2000

1. Background:

JPEG stands for Joint Photographic Experts Group, a committee under the International Organization for Standardization (ISO) that develops still image compression standards. It produced the first international standard for still image compression, ISO 10918-1, commonly known as JPEG. Thanks to its excellent quality, JPEG became a great success within just a few years. About 80% of the images on websites are said to use the JPEG compression standard.

However, with the rapid development of multimedia applications, traditional JPEG compression can no longer meet the demands placed on multimedia image data. This led to JPEG 2000, a new generation of still image compression technology with a higher compression ratio and more features. The official designation of JPEG 2000 is “ISO 15444”, and it was also developed by the JPEG committee.

2. Basic concepts

JPEG 2000 is an image compression standard based on the wavelet transform, created and maintained by the Joint Photographic Experts Group. It is generally regarded as the next-generation image compression standard intended to replace JPEG (which is based on the discrete cosine transform). The file extension of a JPEG 2000 file is usually .jp2, and the MIME type is image/jp2.

JPEG 2000 has a higher compression ratio and does not produce the blocking artifacts typical of the original JPEG standard based on the discrete cosine transform.

JPEG 2000 supports both lossy and lossless compression.

In addition, JPEG 2000 supports more sophisticated progressive display and download.

Because JPEG 2000 still achieves a good compression ratio in lossless mode, it is widely used in the analysis and processing of medical images, where image quality requirements are high.

3. JPEG2000 principle

The biggest difference between JPEG 2000 and traditional JPEG is that it abandons block-based coding built on the discrete cosine transform and instead uses multi-resolution coding built on the wavelet transform.

The main purpose of wavelet transform is to extract the frequency components of the image. Refer to the following figure for a simple schematic diagram.
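
As a rough illustration of how a wavelet transform separates frequency components, here is a minimal sketch of one level of a 1-D Haar transform. This is a simplification chosen for clarity: JPEG 2000 itself uses the CDF 5/3 and 9/7 wavelets, not Haar, and applies them in two dimensions. The function names are made up for this example.

```python
def haar_step(signal):
    """One level of an (unnormalized) 1-D Haar transform.

    Returns (approx, detail): pairwise averages carry the low-frequency
    content, pairwise half-differences carry the high-frequency content.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original samples from averages and half-differences."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out


# One row of pixel values with a sharp edge between 14 and 200.
row = [10, 12, 14, 200, 202, 204, 50, 52]
approx, detail = haar_step(row)
print(approx)   # smooth, half-resolution version of the row
print(detail)   # large magnitude only where the edge falls inside a pair
print(haar_inverse(approx, detail))
```

Most detail coefficients are small, which is what makes them cheap to encode. Applying haar_step again to approx gives the next, coarser resolution level.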

4. JPEG2000 benefits

(1) As an upgraded version of JPEG, JPEG 2000 targets high compression (low bit rates); its compression ratio is about 30% higher than that of JPEG.

(2) JPEG 2000 supports both lossy and lossless compression, whereas JPEG as commonly used is lossy only. JPEG 2000 is therefore well suited to archiving important pictures.

(3) JPEG 2000 supports progressive transmission, one of its most important features. This is similar to the “fade in” effect familiar from GIF images: the outline of the image is transmitted first, and further data then gradually improves the image quality, so the image goes from hazy to clear instead of being displayed slowly from top to bottom as with JPEG.

(4) JPEG 2000 supports the so-called “region of interest” feature. You can specify the compression quality of any region of interest in the image, and you can also choose a part of the image to be decompressed first. This makes it easy to emphasize the key areas.

5. JPEG2000 copyright and patent issues

JPEG 2000 carries copyright and patent risks. This may be one of the reasons why the technology has not been widely adopted so far.

The JPEG 2000 standard itself has no licensing fee. However, because many of the algorithms in the core coding part are patented, it is generally considered unlikely that a commercial encoder can be developed without incurring patent fees.

[Solved] pdfplumber error when parsing a PDF: ValueError: not enough values to unpack (expected 2, got 1)

Problem:

Environment: Ubuntu 18.04, parsing an electronic invoice PDF document

Cause: the root cause is not clear. All I could establish is that the error appears after pdfminer was rebuilt and reinstalled for Chinese fonts (if anyone knows the reason, please let us know).

Solution: modify the source code. In practice this just adds a layer of error filtering.
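
The post does not show the actual source change. As a hedged sketch of the general idea of "a layer of error filtering", the following catches the ValueError per page on the caller's side instead of patching the library. The names extract_all and fake_extract are hypothetical, and fake_extract only imitates a parser that fails with the same unpacking error.

```python
def extract_all(pages, extract):
    """Apply extract() to every page, filtering out ValueError failures.

    Returns (texts, failed): texts has one entry per page (None where
    extraction failed), failed lists (page_index, error_message) pairs.
    """
    texts, failed = [], []
    for index, page in enumerate(pages):
        try:
            texts.append(extract(page))
        except ValueError as exc:
            # Record the failure and keep going with the next page.
            failed.append((index, str(exc)))
            texts.append(None)
    return texts, failed


def fake_extract(page):
    """Stand-in parser: unpacking fails when '=' is missing."""
    key, value = page.split("=")
    return value


texts, failed = extract_all(["a=1", "broken", "c=3"], fake_extract)
print(texts)
print(failed)
```

With pdfplumber, pages would be the pages attribute of the object returned by pdfplumber.open(...), and extract could be a function that calls page.extract_text(). Skipped pages should still be inspected afterwards, since filtering hides the error rather than fixing its cause.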

How to set TeX (LaTeX) PDF output to US Letter (letterpaper)

Some conferences and journals (such as IEEE) require submitted papers, especially the final version, to use the US Letter (letterpaper) page size. The default PDF output of LaTeX and TeX is A4, so this has to be changed.

The setting can be changed as follows:

Method 1 (recommended):

Under Options / Configuration Wizard / Diagnosis, select the MiKTeX configuration,

then click Execution Modes to open a dialog box in which all commands can be configured

(in newer versions, Execution Modes can be found directly under Options),

and change the paper size and orientation options for dvi2ps, dvipdf and the other tools to Letter.

Method 2:

Open a DOS window and run dvipdfm -p letter *

[Solution note] TeX: compiling with an inserted PDF figure reports “no BoundingBox”

I ran into this problem recently while submitting a revised version of an article. The TeX file includes two PDF figures. Compiling locally with pdflatex works without problems, but the journal's submission system uses pdftex, which cannot process the PDF figures:

pdfTeX Error: Cannot determine size of graphic in xxx.pdf (no BoundingBox).

A quick search shows that the usual fixes are to remove the option passed when loading the graphicx package, or to compile with pdflatex instead. But I had not passed any such option, and the engine of the online compiler cannot be changed, so I had to choose a third approach: set the bounding box explicitly when inserting the PDF figure.

First, you need to know the size of the PDF to be inserted. Open the PDF file with Vim in a terminal; ahead of a large amount of unreadable binary data there is a line that gives the box of the PDF page. For one of my files, the line looks like this:

/MediaBox [ 0 0 929.10375 471.60375 ] /Annots [ ] /Resources 8 0 R

So the PDF figure is inserted like this:

\begin{figure}
\centering
\includegraphics[width=\textwidth, bb=0 0 930 475]{xx.pdf}
\caption{xxxxx\label{fig:xx}}
\end{figure}

Here, width sets how much page space the figure takes up, and the box set with bb should be slightly larger than the box given in the PDF; the exact values depend on how the result looks. Note that if the box height is too small, the top of the figure may overlap the page header. In that case, increase the fourth parameter (the height of the box).
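
Instead of reading the box by eye in Vim, the /MediaBox entry can be pulled out with a short script. This is a minimal sketch under the assumption that the page dictionary is stored uncompressed, as in the line quoted above; the names find_mediabox and bb_option are made up for this example. It rounds the box outward to whole points, and margin is an optional extra allowance.

```python
import math
import re

# Matches "/MediaBox [ llx lly urx ury ]" with plain decimal numbers.
MEDIABOX_RE = re.compile(
    rb"/MediaBox\s*\[\s*([-+]?[\d.]+)\s+([-+]?[\d.]+)"
    rb"\s+([-+]?[\d.]+)\s+([-+]?[\d.]+)\s*\]"
)


def find_mediabox(pdf_bytes):
    """Return the first MediaBox as four floats, or None if not found."""
    match = MEDIABOX_RE.search(pdf_bytes)
    if match is None:
        return None
    return tuple(float(group) for group in match.groups())


def bb_option(mediabox, margin=0):
    """Build a bb=... option string, rounded outward to whole points."""
    llx, lly, urx, ury = mediabox
    return "bb={} {} {} {}".format(
        math.floor(llx),
        math.floor(lly),
        math.ceil(urx) + margin,
        math.ceil(ury) + margin,
    )


# The MediaBox line quoted in the post.
sample = b"/MediaBox [ 0 0 929.10375 471.60375 ] /Annots [ ] /Resources 8 0 R"
box = find_mediabox(sample)
print(bb_option(box))
```

For a real file, pass open(path, "rb").read() to find_mediabox. If the figure still collides with the header, raise margin or adjust the fourth value by hand, as described above.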

Which PDF editor would you recommend? Try this convenient, easy-to-use editing software

Looking for a PDF editor recommendation? I think this convenient, easy-to-use editing software is worth a try. When you study or work with PDF files, do you worry that you can only view them and cannot edit them?

If you want to both view and edit PDF files, I recommend “quick PDF Editor”. It is convenient and easy to use, so give it a try. It can operate on a PDF file as a whole: merging, splitting, compression, watermarking, encryption and so on. It can also change the original text of a PDF, re-edit and re-typeset the content, and add new graphics, shapes, links, notes, stamps, music and more.

Curious about this PDF editor? Next, let’s go through its advantages.

Clean interface

The interface uses mainly black, gray, white and blue, and the color scheme is simple and elegant.

Clear picture

For different resolutions, the display effect is optimized.

Adjustable zoom

The display scale of PDF files can be adjusted.

Simple operation and higher efficiency

Click a function button to use the corresponding function on the PDF page. The interface layout is easy to understand, and page operations are simple and efficient.

Content editing

PDF files can be operated on as a whole: merging, splitting, compression, watermarking, encryption and so on. You can also change the original content of a PDF file, re-edit and re-typeset it, and add new graphics, shapes, links, notes, stamps, music and more.

After reading the above, are you tempted by “quick PDF Editor”? This convenient, easy-to-use PDF editor has more features waiting for you to explore.