Jun 25, 2013

Read .docx file as string

After a day of googling, I finally found a function that reads a .docx file and returns its text as a string. Hats off to the original author of this function.

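function readDocx($file_name){
    $content = '';
    if(!$file_name || !file_exists($file_name))
        return false;

    // zip_open() returns an integer error code on failure, hence the is_numeric() check.
    // Note: the procedural zip_* functions are deprecated as of PHP 8.0.
    $zip = zip_open($file_name);
    if (!$zip || is_numeric($zip))
        return false;

    while ($zip_entry = zip_read($zip)) {
        // Only the main document part holds the body text; skip everything else
        // before opening, so no entry is left open.
        if (zip_entry_name($zip_entry) != "word/document.xml")
            continue;
        if (zip_entry_open($zip, $zip_entry) == FALSE)
            continue;
        $content .= zip_entry_read($zip_entry, zip_entry_filesize($zip_entry));
        zip_entry_close($zip_entry);
        break; // found the document part, no need to scan the remaining entries
    }// end while
    zip_close($zip);

    // Turn table-cell boundaries into spaces and paragraph ends into line breaks,
    // then drop all remaining XML tags.
    $content = str_replace('</w:r></w:p></w:tc><w:tc>', " ", $content);
    $content = str_replace('</w:r></w:p>', "\r\n", $content);
    $striped_content = strip_tags($content);
    return $striped_content;
}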
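
The zip_* functions used above are deprecated as of PHP 8.0, so on newer PHP versions the same idea can be sketched with the ZipArchive class instead. This is only a rough sketch, not the original author's code: the function name readDocxWithZipArchive is made up for illustration, and it assumes the zip extension is enabled. The tag replacements are the same as in the function above.

```php
<?php
// Sketch only: readDocxWithZipArchive is a hypothetical name, not from the original function.
function readDocxWithZipArchive($file_name) {
    if (!$file_name || !file_exists($file_name)) {
        return false;
    }

    $zip = new ZipArchive();
    // open() returns true on success, or an integer error code on failure.
    if ($zip->open($file_name) !== true) {
        return false;
    }

    // getFromName() returns the entry contents, or false if the entry is missing.
    $content = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($content === false) {
        return false;
    }

    // Same clean-up as above: cell boundaries become spaces,
    // paragraph ends become line breaks, remaining tags are stripped.
    $content = str_replace('</w:r></w:p></w:tc><w:tc>', " ", $content);
    $content = str_replace('</w:r></w:p>', "\r\n", $content);
    return strip_tags($content);
}
```

Calling readDocxWithZipArchive('report.docx') (the file name here is just a placeholder) should give the same kind of string as readDocx(), or false if the file cannot be opened or has no word/document.xml inside.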
