Extract pages from PDFs on Android

PdfProcessor can extract pages from one document into a new document. You can extract a single page, a range of pages, or multiple page ranges:

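// Page indexes start at 0, so index 4 is the fifth page of the document.
val singlePageTask = PdfProcessorTask.fromDocument(document).keepPages(setOf(4))
// Keep pages 5, 6, and 7.
val pageRangeTask = PdfProcessorTask.fromDocument(document).keepPages(setOf(4, 5, 6))
// Keep two page ranges: pages 1 and 2, and pages 5, 6, and 7.
val multiRangeTask = PdfProcessorTask.fromDocument(document).keepPages(setOf(0, 1, 4, 5, 6))
// Remove the first page.
val withoutFirstPageTask = PdfProcessorTask.fromDocument(document).removePages(setOf(0))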
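
Because page indexes are zero-based while people usually think in one-based page numbers, it can help to build the index set from page ranges instead of typing it by hand. The following is a minimal sketch of such a helper. The name pageIndexesOf is hypothetical and not part of PSPDFKit; it uses only the Kotlin standard library.

```kotlin
/**
 * Hypothetical helper (not part of PSPDFKit): converts one-based,
 * inclusive page ranges into a sorted set of zero-based page indexes.
 */
fun pageIndexesOf(vararg ranges: IntRange): Set<Int> {
    val indexes = sortedSetOf<Int>()
    for (range in ranges) {
        // Reject page numbers below 1, since they have no zero-based index.
        require(range.first >= 1) { "Page numbers start at 1, got ${range.first}" }
        for (pageNumber in range) {
            // Shift from a one-based page number to a zero-based index.
            indexes.add(pageNumber - 1)
        }
    }
    return indexes
}
```

Assuming keepPages accepts a Set<Int> as in the snippets above, pages 2 to 3 and 5 to 7 could then be selected with keepPages(pageIndexesOf(2..3, 5..7)), which passes the indexes 1, 2, 4, 5, and 6.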

After creating a PdfProcessorTask, start the extraction by calling either the PdfProcessor#processDocumentAsync method or the PdfProcessor#processDocument method. By default, all annotations are preserved. You can queue multiple operations on a document by calling multiple methods on the PdfProcessorTask object before processing starts. The operations are executed in the same order as your method calls:

val outputFile = File(filesDir, "extracted-pages.pdf")
// Keep pages 5, 6, and 7.
val task = PdfProcessorTask.fromDocument(document).keepPages(setOf(4, 5, 6))
PdfProcessor.processDocumentAsync(task, outputFile)
    // Run processing on a background thread.
    .subscribeOn(Schedulers.io())
    // Publish results on the main thread so we can update the UI.
    .observeOn(AndroidSchedulers.mainThread())
    .subscribe(
        // Called for each progress update.
        { progress: PdfProcessor.ProcessorProgress ->
            Toast.makeText(context, "Processing page ${progress.pagesProcessed}/${progress.totalPages}", Toast.LENGTH_SHORT).show()
        },
        // Called if processing fails.
        { error: Throwable ->
            Toast.makeText(context, "Processing has failed: ${error.message}", Toast.LENGTH_SHORT).show()
        },
        // Called once processing has finished.
        { Toast.makeText(context, "Processing has been completed successfully.", Toast.LENGTH_SHORT).show() }
    )

💡 Tip: You can use page extraction to merge pages of two or more documents. All you need to do is load a compound PdfDocument — for example, by using PSPDFKit#openDocuments. Have a look at DocumentProcessingExample inside the Catalog app for a demo of this.