Broad Network


Quick and Simple Way to Convert a docx Document into an HTML and CSS File, Using LibreOffice Writer

By: Chrysanthus Date Published: 3 Feb 2025

Introduction

Many people produce their initial documents in docx format. If a word processor such as LibreOffice Writer or Microsoft Word can open a docx document and can also save a document as an HTML/CSS file, then it can produce an HTML/CSS file from the docx document.

There are certain websites (platforms) and social media sites that do not allow you, the blogger, to produce certain kinds of typography, such as superscripts, subscripts, tables, etc. Yet these platforms expect you to present a web page with all the necessary typography. The solution is to convert your docx document, with all the necessary typography, into an HTML/CSS file. However, doing so by hand (manually) is tedious and very time consuming. You do not have that time! There are free and commercial applications out there that do the conversion, but without presenting the typography in its original form, and with some loss of intended information. In particular, many of them replace consecutive spaces, created by pressing the spacebar repeatedly, with a single space.

The original file (article) of interest should already be saved on your home (or office) computer in docx format. Open that docx file with LibreOffice Writer. The opened docx document can then be converted into an HTML/CSS file by saving it in HTML/CSS format. This is done from the menu bar of the opened docx document, with the following command sequence:

File | Save As. . .

In the dialog box that appears, choose your directory (folder) and, if required, a new name for the corresponding HTML/CSS file, and then choose the file type "HTML Document (Writer)(.html)". Then click the Save button. And there you have it: the HTML/CSS equivalent of your docx document.
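The same save step can also be scripted instead of clicked. The following is only a sketch, assuming that the LibreOffice command-line program is installed and reachable under the name soffice (the name and location vary between installations), and that its headless "--convert-to html" conversion gives a result comparable to the Save As dialog; the function names are made up for this illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(docx_path, out_dir, soffice="soffice"):
    """Return the argument list for a headless docx-to-HTML conversion."""
    return [
        soffice,                 # assumed executable name; adjust for your system
        "--headless",            # run without opening a window
        "--convert-to", "html",  # target format
        "--outdir", str(out_dir),
        str(docx_path),
    ]


def convert_docx_to_html(docx_path, out_dir):
    """Run the conversion and return the expected path of the .html file."""
    if shutil.which("soffice") is None:
        raise FileNotFoundError("soffice executable not found on PATH")
    subprocess.run(build_convert_command(docx_path, out_dir), check=True)
    return Path(out_dir) / (Path(docx_path).stem + ".html")


# Example call (hypothetical file name):
# convert_docx_to_html("Illustrating the Principle.docx", "output")
```

The resulting file may still need the same editing described in the rest of this article.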

You can then go to the website or platform expecting a web page from you, and insert the whole HTML/CSS document code (or sections of it), with all the necessary typography.

If the website (platform) expects you to present your document in sections, then copy corresponding sections from the HTML/CSS file code, and insert them in the appropriate sections at the website page.

Remember: in order to obtain the code of the HTML/CSS document file, open the HTML/CSS document in a Text Editor. Then do some copying from the text editor and pasting (inserting) at the website page.
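If you prefer not to select the code by hand, the part of the file between the body tags can be pulled out with a few lines of Python. This is a minimal sketch, assuming the file contains a single body element; the function name extract_body is made up for this illustration.

```python
import re


def extract_body(html):
    """Return the markup between <body ...> and </body>.

    If no body element is found, the input is returned unchanged.
    """
    match = re.search(r"<body\b[^>]*>(.*?)</body\s*>", html,
                      re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else html


# Example use with a saved file (hypothetical file name):
# from pathlib import Path
# code = Path("Illustrating the Principle.html").read_text(encoding="utf-8")
# print(extract_body(code))
```

The printed code can then be copied and pasted at the website page, whole or in sections.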

And so, no need to spend a lot of time learning XML and docx coding, just to be able to produce an HTML/CSS file, from a docx file, for a document report or for a web page.

Authors at social network (blogging) sites now have the solution to their typography problems (superscripts, subscripts, tables, etc.), which is this article. Such sites would need the copying of sections of HTML code from the HTML file produced as explained above, and then the pasting of the sections (or the whole code) appropriately into the server web page, as HTML code, as allowed by the web server page.

Some Immediate Issues

The HTML/CSS file produced is not really how it should be. So there is need for some editing of the code. You, the reader, need at least basic (beginner) knowledge of HTML and CSS in order to understand the rest of this article. The rest of this article uses LibreOffice Writer for illustration. Operations in Microsoft Word should be similar.

Consecutive Spacebar Spaces

You will notice that with this scheme, consecutive spacebar spaces are reduced to a single space. The solution is to restore the spaces by inserting the corresponding number of "&nbsp;" codes at the corresponding location in the complete HTML/CSS code, in the text editor.
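For a line of text where you know the original spacing, the replacement code can be generated rather than typed. This is only a sketch, assuming the code meant above is the HTML non-breaking space entity &nbsp; and that you supply the text with its original spaces (the exported HTML no longer contains them); the function name preserve_spaces is made up for this illustration.

```python
import re


def preserve_spaces(text):
    """Replace each run of two or more spaces with as many &nbsp; entities."""
    return re.sub(r" {2,}", lambda m: "&nbsp;" * len(m.group()), text)


# A run of three spaces becomes three entities; single spaces are kept.
print(preserve_spaces("Name:   Chrysanthus"))
```

The returned string is then pasted over the corresponding text in the HTML/CSS code.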

Images

To place an image in a docx document while the document is open, select the icon of the image in its directory window by left-clicking it, and click "Copy" in the pop-up menu that appears. In the opened docx document, position the cursor where you want the image by right-clicking there, and then click Paste (Edit|Paste): the image will now be in the opened docx document. In the meantime, the docx document can be saved, still in docx format. The image type should be one accepted by the website (platform) that requires the ultimate information. Most websites accept .png and .gif images.

When the docx document is saved as an "HTML Document (Writer)(.html)" file, which is still referred to as an HTML file, a separate file for the image is saved in the same directory as the ".html" file. The .docx file will still be there (if it was saved). If the name of the HTML file is "Illustrating the Principle.html", for example, then the image filename will be something like "Illustrating the Principle_html_7b173026db363bcf.png"; that is, the name of the HTML file followed by a code string, e.g. "_html_7b173026db363bcf.png".

The HTML file and the image file have to be uploaded to the same directory at the web server. If images are kept in a different directory at the web server, then the image has to be sent to that particular image directory, and the value of the src attribute in the HTML file has to be preceded by the URL path to the image directory.
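When there are many images, the src values can be prefixed in one pass. The following is a sketch under assumptions: the src values are written in double quotes, the directory URL "https://example.com/images" is a placeholder and not a real location, and the function name prefix_image_sources is made up for this illustration.

```python
import re


def prefix_image_sources(html, image_dir_url):
    """Prepend image_dir_url to every relative src value in <img> tags."""
    base = image_dir_url.rstrip("/") + "/"

    def fix(match):
        before, src, after = match.group(1), match.group(2), match.group(3)
        # Leave absolute URLs (scheme:) and root-relative paths (/) untouched.
        if re.match(r"^(?:[a-zA-Z][a-zA-Z0-9+.-]*:|/)", src):
            return match.group(0)
        return before + base + src + after

    return re.sub(r'(<img\b[^>]*?\bsrc=")([^"]*)(")', fix, html,
                  flags=re.IGNORECASE)


tag = '<img src="Illustrating%20the%20Principle_html_7b173026db363bcf.png" name="Image1" />'
print(prefix_image_sources(tag, "https://example.com/images"))
```

Check the result in a browser afterwards, since a regular expression is not a full HTML parser.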

In the HTML/CSS file code, the single image tag ends with " />" and not just ">". Leave such code as it is. The major browsers treat both the same.

Paragraphs

To create a paragraph in the docx document, just start typing. When the Enter key of the keyboard is pressed, a paragraph is formed. Pressing the Enter key again introduces an empty paragraph with at least one internal blank line. In the HTML code it will be something like:

<p class="western" align="left"><br/>

<br/>

</p>

Any such unwanted empty paragraph with a blank line has to be removed from the complete HTML/CSS code, by deleting the complete paragraph element together with its <br/> contents. Each normal paragraph element comes with a good margin above and below it, resulting in natural space between consecutive paragraphs (without the need for a blank line or an empty paragraph between them).
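The deletion can be automated for a long document. This is a minimal sketch, assuming that an empty paragraph contains nothing but whitespace and line break tags, as in the example above; the names EMPTY_PARAGRAPH and remove_empty_paragraphs are made up for this illustration.

```python
import re

# A <p ...> element whose content is only whitespace and <br>, <br/> or <br /> tags.
EMPTY_PARAGRAPH = re.compile(r"<p\b[^>]*>(?:\s|<br\s*/?>)*</p>\s*", re.IGNORECASE)


def remove_empty_paragraphs(html):
    """Delete paragraph elements that hold only line breaks and whitespace."""
    return EMPTY_PARAGRAPH.sub("", html)


code = ('<p>One</p>\n'
        '<p class="western" align="left"><br/>\n\n<br/>\n\n</p>\n'
        '<p>Two</p>')
print(remove_empty_paragraphs(code))
```

Paragraphs with any text in them are left untouched.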

Line Break

Notice that the line break tag in the HTML/CSS file is the XHTML form, <br/>, instead of <br>. In normal HTML/CSS code, a line break is written as <br>. If <br/> is found where <br> is expected in the resulting HTML code (same as HTML/CSS code), leave the <br/> as it is, since all major browsers still treat <br/> as <br>.

Replacing Opening and Closing Double Quotes with the Straight Double Quote (")

The code for an opening or closing double quote (the curly quotes “ and ”) may be combined with a preceding or succeeding byte code, in the web page at the web server, to produce strange symbols or characters for the user. So, in the docx document, replace every opening double quote and every closing double quote with the straight double quote, " . This is the double quote character typed from the keyboard in the text editor.
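If the replacement was forgotten in the docx document, it can still be done on text afterwards. This is a sketch, assuming the opening and closing double quotes are the Unicode characters U+201C and U+201D; the names CURLY_TO_STRAIGHT and straighten_double_quotes are made up for this illustration. Apply it to the text content only, because a straight double quote placed inside an HTML attribute value would end that value early.

```python
# Map the opening (U+201C) and closing (U+201D) double quotes to the keyboard quote.
CURLY_TO_STRAIGHT = str.maketrans({"\u201c": '"', "\u201d": '"'})


def straighten_double_quotes(text):
    """Replace curly double quotes with the straight double quote."""
    return text.translate(CURLY_TO_STRAIGHT)


print(straighten_double_quotes("He said \u201chello\u201d."))
```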

Styles and Table

The HTML file produced comes with a style sheet in the head element of the HTML page. Table styles will be found there, in the head element. The trouble is that you may not be allowed to alter the style sheet at the web server, and you may wish to insert just a few sections of the HTML code into the web page at the web server. So, include inline styles for tables, where necessary.
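Adding the same inline style to every table tag can be scripted. The following is only a sketch: the CSS values shown are placeholders and not the styles that LibreOffice writes, tags that already have a style attribute are skipped, and the function name add_inline_style is made up for this illustration.

```python
import re


def add_inline_style(html, tag, css):
    """Add style="css" to every <tag ...> that has no style attribute yet."""
    pattern = re.compile(r"<%s\b(?![^>]*\bstyle\s*=)([^>]*)>" % re.escape(tag),
                         re.IGNORECASE)
    return pattern.sub(
        lambda m: '<%s%s style="%s">' % (tag, m.group(1), css), html)


code = '<table width="100%"><tr><td>1</td></tr></table>'
code = add_inline_style(code, "table", "border-collapse: collapse")
code = add_inline_style(code, "td", "border: 1px solid black")
print(code)
```

Copy the actual property values you need from the style sheet in the head element of your own HTML file.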

A table is created in the docx document, beginning by clicking the "Table" menu item in the menu bar, and continuing from there.

Unwanted Numbering of Headings

When the web page is displayed (in the home computer, for example), you may notice a heading like:

1. Introduction

The "1. " in front of "Introduction" may not have been wanted (by you). In the HTML code file, the code may be something like:

<ol>

    <ol>

        <li><h2 class="western" align="left">Introduction</h2>

    </ol>

</ol>

The solution is to remove all the unnecessary tags in the HTML code, allowing just:

<h2 class="western" align="left">Introduction</h2>
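For a document with many headings, the unwrapping can be done in one pass. This is a sketch, assuming each unwanted list wraps exactly one heading, as in the example above; a real list that happens to start with a heading would be damaged, so review the result. The names WRAPPED_HEADING and unwrap_headings are made up for this illustration.

```python
import re

# One or more <ol> tags, an <li>, a heading, an optional </li>, then the </ol> tags.
WRAPPED_HEADING = re.compile(
    r"(?:<ol\b[^>]*>\s*)+<li\b[^>]*>\s*"
    r"(<h([1-6])\b[^>]*>.*?</h\2>)"
    r"\s*(?:</li>\s*)?(?:</ol>\s*)+",
    re.IGNORECASE | re.DOTALL,
)


def unwrap_headings(html):
    """Keep only the heading element from each list-wrapped heading."""
    return WRAPPED_HEADING.sub(lambda m: m.group(1) + "\n", html)


code = ('<ol>\n    <ol>\n        <li><h2 class="western" align="left">'
        'Introduction</h2>\n    </ol>\n</ol>\n<p>Text</p>')
print(unwrap_headings(code))
```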

Hyperlinks

To insert a hyperlink (link) into the docx document, where the cursor is, begin by clicking Insert|Hyperlink. . . and continue from there. By so doing, unwanted code that would make the coding messy will not be introduced into the resulting HTML/CSS file (code).

Formula

To have a mathematical formula in the docx document, begin with the commands, Insert|Object|Formula Object. . . and continue from there.

When the docx document is saved as an HTML file (HTML Document (Writer)(.html)), the formula will be saved as a .gif image. If the name of the HTML file is "Illustrating the Principle.html", for example, then the image filename, as written in the HTML code, will be something like "Illustrating%20the%20Principle_html_30b118fbe54affac.gif"; that is, the name of the HTML file followed by a code string, with each space written as %20. From then on, treat the formula image file as an ordinary image file. Many browsers accept gif images.
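The %20 in the filename above is the URL encoding of a space. The Python standard library can convert between the filename on disk and the form used in the src attribute. This is a small sketch; the function names src_for_file and file_for_src are made up for this illustration.

```python
from urllib.parse import quote, unquote


def src_for_file(filename):
    """Return the filename as it appears in a src attribute (spaces as %20)."""
    return quote(filename)


def file_for_src(src):
    """Return the filename on disk for a URL-encoded src value."""
    return unquote(src)


print(src_for_file("Illustrating the Principle_html_30b118fbe54affac.gif"))
print(file_for_src("Illustrating%20the%20Principle_html_30b118fbe54affac.gif"))
```

This is useful when matching the src values in the HTML code against the image files to be uploaded.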

Good Practices in order to have limited Unnecessary HTML/CSS Code

As seen from above, after producing the HTML/CSS file, there has to be some editing. When producing the docx document, good practices have to be followed in order to limit unnecessary code in the resulting HTML/CSS document (file). If good practices are not followed (as illustrated above), the editing of the resulting HTML/CSS code can be difficult.

Good practices can be summarized as follows: With the exception of the paragraph element, all other elements must be produced using the horizontal menu bar (or the tools in the tool bar) when producing the docx document. Styling of an element in the opened docx document is done by first selecting the element, then clicking the Style menu item of the horizontal menu bar, and continuing from there. The different sized headings in the docx document, for example, should be obtained by first selecting (highlighting) the text of interest, then clicking the "Style" menu item in the horizontal menu bar and choosing the particular heading. Formatting is done in a similar way, using the Format menu item. With all that, when the docx document is saved as an HTML-Document-(Writer)(.html) file, there will be little unnecessary code, and editing of the resulting HTML/CSS file will be easy.

Conclusion

There is no need to spend a lot of time learning XML and docx internal coding just to be able to produce an HTML/CSS file, for an HTML/CSS report or for a website (platform). Authors at some social network (blogging) sites now have the solution to their typography problems (superscripts, subscripts, tables, etc.), which is this article. Such sites would need the copying of sections of HTML code from the HTML file produced as explained above, and then the pasting of the sections (or the whole file code) appropriately into the server web page, as HTML/CSS code, as allowed by the web server page. If the web server page does not allow the injection of a style sheet into its HTML head element, then all the styles required should be made inline styles at the HTML elements, in the HTML file.

The HTML/CSS page can also be produced without first producing a docx document, as follows: Open an empty document in LibreOffice Writer, and save it as an HTML-Document-(Writer)(.html) file. Then type the content of the document, following the good practices mentioned above, while the document is still open in LibreOffice Writer. When this document is saved again on the home or office computer, it will be saved as an HTML/CSS file, along with image files (if applicable). In that way, the opened HTML-Document-(Writer)(.html) document serves as an HTML/CSS production tool. This article should also be seen as: How to Use a Modern Word Processor as an HTML/CSS Production Tool, for free, if the author has some basic (beginner) knowledge of HTML and CSS.

Thanks for reading.




