DocumentReader
This library reads Word documents (.doc and .docx), plain-text (.txt) files, and PDF files, and returns the content of the document as a String.
If you have ever tried to read the contents of a PDF or MS Word document on Android, you know how painful it is. This library does that work for you.
Repository for build.gradle (Project level)
repositories {
    ...
    maven { url 'https://jitpack.io' }
}
Dependency for build.gradle (Module: app)
dependencies {
    ....
    implementation 'com.github.Asutosh11:DocumentReader:0.12'
    // NOTE: use this only if you get a multidex exception
    implementation "androidx.multidex:multidex:2.0.1"
}
// The two blocks below go inside the android block of the same module-level build.gradle
android {
    ...
    // NOTE: use this only if you get an error like "More than one file was found with OS independent path"
    packagingOptions {
        exclude 'META-INF/DEPENDENCIES'
        exclude 'META-INF/INDEX.LIST'
        exclude 'META-INF/spring.handlers'
        exclude 'META-INF/spring.schemas'
        exclude 'META-INF/cxf/bus-extensions.txt'
    }
    // NOTE: use this only if you get a multidex exception
    defaultConfig {
        ...
        multiDexEnabled true
    }
}
How to use it?
// Each call below is an alternative; use the one that matches your file type and source.
// Read a PDF file from a Uri
val docString: String = DocumentReaderUtil.readPdfFromUri(fileUri, applicationContext)
// Read a PDF file from a File
val docString: String = DocumentReaderUtil.readPdfFromFile(file, applicationContext)
// Read a .doc file from a Uri
val docString: String = DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
// Read a .doc file from a File
val docString: String = DocumentReaderUtil.readWordDocFromFile(file, applicationContext)
// Read a .docx file from a Uri (same call as for .doc)
val docString: String = DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
// Read a .docx file from a File (same call as for .doc)
val docString: String = DocumentReaderUtil.readWordDocFromFile(file, applicationContext)
// Read a .txt file from a Uri
val docString: String = DocumentReaderUtil.readTxtFromUri(fileUri, applicationContext)
/*
Even if you don't know the file type in advance,
the library can detect the file's MIME type, and you can call the matching reader
to get the content of the file as a String. Unsupported types yield an empty String here.
*/
val docString: String = when (DocumentReaderUtil.getMimeType(fileUri, applicationContext)) {
    "text/plain" -> DocumentReaderUtil.readTxtFromUri(fileUri, applicationContext)
    "application/pdf" -> DocumentReaderUtil.readPdfFromUri(fileUri, applicationContext)
    "application/msword" -> DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document" ->
        DocumentReaderUtil.readWordDocFromUri(fileUri, applicationContext)
    else -> ""
}
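If the same MIME-type check is needed in several places, the mapping above can be pulled into a small pure function that is easy to unit test without an Android device. The following is only a sketch: `DocKind` and `docKindFor` are hypothetical names that are not part of DocumentReader, and the handling of upper-case types and `; charset=...` suffixes is an assumption about what a MIME string might look like, not something the library is known to return.

```kotlin
// Hypothetical helper, NOT part of the DocumentReader library.
// It only classifies a MIME type; the actual reading is still done by DocumentReaderUtil.
enum class DocKind { TXT, PDF, WORD }

fun docKindFor(mimeType: String?): DocKind? {
    // Assumption: drop any parameters such as "; charset=utf-8" and ignore case.
    val base = mimeType?.substringBefore(';')?.trim()?.lowercase()
    return when (base) {
        "text/plain" -> DocKind.TXT
        "application/pdf" -> DocKind.PDF
        // .doc and .docx are both read by the library's readWordDoc* functions
        "application/msword",
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document" -> DocKind.WORD
        // Anything else is unsupported
        else -> null
    }
}

fun main() {
    println(docKindFor("application/pdf")) // PDF
    println(docKindFor("image/png"))       // null
}
```

In an app, the result of DocumentReaderUtil.getMimeType(fileUri, applicationContext) would be passed to docKindFor, and each DocKind would be matched to readTxtFromUri, readPdfFromUri or readWordDocFromUri. Returning null for unsupported types, rather than an empty String, lets the caller tell "unsupported file" apart from "empty document".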
Thanks
The Apache Tika project
Apache's PdfBox port by TomRoush