java.lang.Object
  org.pdfbox.util.PDFStreamEngine
      org.pdfbox.util.PDFTextStripper
          org.pdfbox.util.PDFTextStripperByArea
This will extract text from specified, named regions of a PDF page.
| Field Summary |
| Fields inherited from class org.pdfbox.util.PDFTextStripper |
| charactersByArticle, output |
| Constructor Summary |
| PDFTextStripperByArea() - Constructor. |
| Method Summary | |
| void | addRegion(String regionName, Rectangle2D rect) - Add a new region to group text by. |
| void | extractRegions(PDPage page) - Process the page to extract the region text. |
| protected void | flushText() - This will print the text to the output stream. |
| List | getRegions() - Get the list of regions that have been set up. |
| String | getTextForRegion(String regionName) - Get the text for the region; this should be called after extractRegions(). |
| protected void | showCharacter(TextPosition text) - This will add a character to the list of characters to be printed to the text file. |
| Methods inherited from class org.pdfbox.util.PDFStreamEngine |
| getColorSpaces, getCurrentPage, getFonts, getGraphicsStack, getGraphicsState, getGraphicsStates, getResources, getTextLineMatrix, getTextMatrix, getXObjects, processOperator, processOperator, processStream, processSubStream, registerOperatorProcessor, resetEngine, setColorSpaces, setFonts, setGraphicsStack, setGraphicsState, setGraphicsStates, setTextLineMatrix, setTextMatrix, showString |
| Methods inherited from class java.lang.Object |
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Constructor Detail |
public PDFTextStripperByArea()
                      throws IOException
Constructor.
Throws:
IOException - If there is an error loading properties.
| Method Detail |
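public void addRegion(String regionName,
                      Rectangle2D rect)
Add a new region to group text by.
Parameters:
regionName - The name of the region.
rect - The rectangle area to retrieve the text from.

public List getRegions()
Get the list of regions that have been set up.

public String getTextForRegion(String regionName)
Get the text for the region; this should be called after extractRegions().
Parameters:
regionName - The name of the region to get the text from.

public void extractRegions(PDPage page)
                    throws IOException
Process the page to extract the region text.
Parameters:
page - The page to extract the regions from.
Throws:
IOException - If there is an error while extracting text.

protected void showCharacter(TextPosition text)
This will add a character to the list of characters to be printed to the text file.
Overrides:
showCharacter in class PDFTextStripper
Parameters:
text - The description of the character to display.

protected void flushText()
                  throws IOException
This will print the text to the output stream.
Overrides:
flushText in class PDFTextStripper
Throws:
IOException - If there is an error writing the text.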
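
The call order documented above (addRegion, then extractRegions, then getTextForRegion) can be illustrated with a small stand-alone sketch. This is not the PDFBox implementation and it does not use the PDFBox classes: RegionGroupingSketch, Glyph and sample() are hypothetical names introduced only for illustration, and Glyph is a simplified stand-in for TextPosition. The sketch assumes that a character belongs to a region when its position lies inside the region's rectangle, and that getRegions() returns the region names; the real class may decide both differently.

```java
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in, NOT the PDFBox class: groups positioned characters by named region. */
class RegionGroupingSketch {

    /** Hypothetical simplification of a positioned character (PDFBox uses TextPosition). */
    static final class Glyph {
        final double x;
        final double y;
        final String character;

        Glyph(double x, double y, String character) {
            this.x = x;
            this.y = y;
            this.character = character;
        }
    }

    // Insertion-ordered maps, so regions are reported in the order they were added.
    private final Map<String, Rectangle2D> regions = new LinkedHashMap<>();
    private final Map<String, StringBuilder> textByRegion = new LinkedHashMap<>();

    /** Registers a named rectangle to group text by. */
    void addRegion(String regionName, Rectangle2D rect) {
        regions.put(regionName, rect);
    }

    /** Returns the names of the registered regions (an assumption of this sketch). */
    List<String> getRegions() {
        return new ArrayList<>(regions.keySet());
    }

    /** Appends each glyph to every region whose rectangle contains its position. */
    void extractRegions(List<Glyph> glyphs) {
        textByRegion.clear();
        for (String name : regions.keySet()) {
            textByRegion.put(name, new StringBuilder());
        }
        for (Glyph glyph : glyphs) {
            for (Map.Entry<String, Rectangle2D> region : regions.entrySet()) {
                if (region.getValue().contains(glyph.x, glyph.y)) {
                    textByRegion.get(region.getKey()).append(glyph.character);
                }
            }
        }
    }

    /** Returns the text collected for a region; empty until extractRegions() has run. */
    String getTextForRegion(String regionName) {
        StringBuilder text = textByRegion.get(regionName);
        return text == null ? "" : text.toString();
    }

    /** Builds a small example: two regions and three made-up glyph positions. */
    static RegionGroupingSketch sample() {
        RegionGroupingSketch sketch = new RegionGroupingSketch();
        sketch.addRegion("header", new Rectangle2D.Double(10, 10, 100, 20));
        sketch.addRegion("footer", new Rectangle2D.Double(10, 190, 100, 20));
        sketch.extractRegions(Arrays.asList(
                new Glyph(15, 15, "H"),
                new Glyph(25, 15, "i"),
                new Glyph(15, 200, "!")));
        return sketch;
    }

    public static void main(String[] args) {
        RegionGroupingSketch sketch = sample();
        for (String name : sketch.getRegions()) {
            System.out.println(name + " = " + sketch.getTextForRegion(name));
        }
    }
}
```

With the real class, the same sequence would presumably be driven by a PDPage rather than a hand-built glyph list: register the regions first, call extractRegions(page) once, then read each region with getTextForRegion.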