
Friday, February 22. 2019

LDC #124: Converting CSV to HTML using Legato

If you have ever needed to publish data from a database on the web, you have run into the added difficulty of working with HTML. In this week’s blog we take a look at Legato’s HTML Writer object. This object lets us create HTML with very little knowledge of HTML while still giving us access to more advanced features such as CSS. To demonstrate the HTML Writer object, we will convert a CSV file to HTML, but the data source could easily be changed to any other file format or database.



This script is meant to be an example so I will post the entire script and then discuss parts of it rather than discussing the script from a design standpoint. Here is the entire script:



#define ROWSPERPAGE 250

void main() {

    handle      h;
    string      tbl[][];
    string      footer, fnDest;
    int         row, col,
                rows, cols,
                srow,
                page, pages,
                mrows,
                rc;


    tbl = CSVReadTable("https://raw.githubusercontent.com/chadwickbureau/baseballdatabank/master/core/People.csv");
    rc = GetLastError();
    rows = ArrayGetAxisDepth(tbl);
    if (rows == 0) {
      MessageBox('x', "Couldn't read CSV file. (0x%08X)", rc);
      return;
      }
    cols = ArrayGetAxisDepth(tbl, AXIS_COL);

    fnDest = BrowseSaveFile("Save HTML...", "HTML Files|*.html;*.htm");
    if (fnDest == "") {
      return;
      }

    h = HTMLCreateWriterObject();
    HTMLSetDTD(h);
    HTMLAddHead(h, "Baseball Players", "Arial, Helvetica, Sans-serif", "9pt");
    HTMLSetCellStyle(h, "padding-top: 5px; padding-bottom: 5px; border-top-width: 1pt; border-top-color: black; border-top-style: solid;");
    ProgressOpen("Building HTML...");

    row = 1;
    page = 1;
    // Round up over the data rows only (row 0 is the header)
    pages = (rows - 2 + ROWSPERPAGE) / ROWSPERPAGE;
    while (row < rows) {
      ProgressSetStatus("Writing row %d of %d", row, rows - 1);
      HTMLTableOpen(h);
      HTMLSetRowStyle(h, "font-weight: bold; text-align: center; text-transform: uppercase;");
      HTMLRowOpen(h);
      for (col = 0; col < cols; col++) {
        HTMLCellOpen(h);
        HTMLAddText(h, tbl[0][col]);
        HTMLCellClose(h);
        }
      HTMLRowClose(h);
      HTMLSetRowStyle(h);

      mrows = row + ROWSPERPAGE;
      if (mrows > rows) {
        mrows = rows;
        }

      srow = 0;
      for (row; row < mrows; row++) {
        ProgressUpdate(row, rows);
        if ((srow % 2) == 0) {
          HTMLSetRowStyle(h, "background-color: lightgrey; vertical-align: top;");
          }
        else {
          HTMLSetRowStyle(h, "background-color: white; vertical-align: top;");
          }
        HTMLRowOpen(h);
        for (col = 0; col < cols; col++) {
          HTMLCellOpen(h);
          HTMLAddText(h, tbl[row][col]);
          HTMLCellClose(h);
          }
        HTMLRowClose(h);
        srow++;
        }
      HTMLTableClose(h);

      HTMLSetBlockStyle(h, "font-size: 80%");
      HTMLAddPara(h, FormatString("Page %d of %d", page, pages), "right");
      HTMLSetBlockStyle(h);
      page++;
      }

    ProgressClose();

    HTMLSetBlockStyle(h, "font-size: 80%");
    footer = "Generated: " + FormatDate(GetLocalTime(), DS_MONTH_DAY_YEAR | DS_DATE_AT_TIME | DS_HHMMSS_12 | DS_INITIAL);
    HTMLAddPara(h, footer);

    HTMLAddFoot(h);
    HTMLWriterToFile(h, fnDest);
    RunProgram(fnDest);
    }


We start off with a define for the number of rows to display on a “page.” When testing the blog script, I discovered that most web browsers don’t perform well with extremely large tables, so I decided to make the script split the table into smaller ones. This define sets how many rows are placed in each table. With the define out of the way, we can discuss the main function.



    handle      h;
    string      tbl[][];
    string      footer, fnDest;
    int         row, col,
                rows, cols,
                srow,
                page, pages,
                mrows,
                rc;


    tbl = CSVReadTable("https://raw.githubusercontent.com/chadwickbureau/baseballdatabank/master/core/People.csv");
    rc = GetLastError();
    rows = ArrayGetAxisDepth(tbl);
    if (rows == 0) {
      MessageBox('x', "Couldn't read CSV file. (0x%08X)", rc);
      return;
      }
    cols = ArrayGetAxisDepth(tbl, AXIS_COL);

    fnDest = BrowseSaveFile("Save HTML...", "HTML Files|*.html;*.htm");
    if (fnDest == "") {
      return;
      }


First, we define our variables for the function. We load our CSV file using CSVReadTable. The example uses a CSV file from an open-source Major League Baseball database. The script then checks that the table loaded properly before getting the number of rows of data and the number of columns. This section could easily be replaced with an ODBC call or other database interaction. Finally, we use BrowseSaveFile to get a filename, which we will use later as the destination HTML file.
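
For readers who want to try the load-and-measure step outside of Legato, here is a minimal Python sketch of the same idea. It is not Legato code and does not call CSVReadTable; the sample data and the read_table and table_size helpers are made up for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for the downloaded CSV file.
SAMPLE_CSV = "id,first,last\np001,Ann,Lee\np002,Bo,Kim\n"


def read_table(text):
    """Parse CSV text into a list of rows, each a list of strings."""
    return list(csv.reader(io.StringIO(text)))


def table_size(tbl):
    """Return (rows, cols); cols is taken from the header row."""
    rows = len(tbl)
    cols = len(tbl[0]) if rows else 0
    return rows, cols


tbl = read_table(SAMPLE_CSV)
rows, cols = table_size(tbl)
if rows == 0:
    # Stop early, just as the script should when nothing was read.
    raise SystemExit("Couldn't read CSV data.")
print(rows, cols)
```

As in the script, the header counts as row 0, so the number of data rows is one less than the reported row count.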



    h = HTMLCreateWriterObject();
    HTMLSetDTD(h);
    HTMLAddHead(h, "Baseball Players", "Arial, Helvetica, Sans-serif", "9pt");
    HTMLSetCellStyle(h, "padding-top: 5px; padding-bottom: 5px; border-top-width: 1pt; border-top-color: black; border-top-style: solid;");
    ProgressOpen("Building HTML...");


These lines set up the HTML Writer object, starting with the HTMLCreateWriterObject function. We then use HTMLSetDTD to set the default DTD (this is HTML 5). We also use the HTMLAddHead function to add a title to the document as well as a default font. The next line sets the style for all table cells in the document. In our case, all our cells will have the same style, so a single call is enough. If you want special formatting for different cells in your table, you may need to set the style at different points in your code. The HTML Writer class also allows the use of CSS classes, which is great if the HTML being written is going to be used on a website. For our sample, inline styles will suffice. Lastly, we open a progress window using the ProgressOpen function. With all of our preparations done, we can move to the writing loop.



    row = 1;
    page = 1;
    // Round up over the data rows only (row 0 is the header)
    pages = (rows - 2 + ROWSPERPAGE) / ROWSPERPAGE;
    while (row < rows) {
      ProgressSetStatus("Writing row %d of %d", row, rows - 1);
      HTMLTableOpen(h);
      HTMLSetRowStyle(h, "font-weight: bold; text-align: center; text-transform: uppercase;");
      HTMLRowOpen(h);
      for (col = 0; col < cols; col++) {
        HTMLCellOpen(h);
        HTMLAddText(h, tbl[0][col]);
        HTMLCellClose(h);
        }
      HTMLRowClose(h);
      HTMLSetRowStyle(h);

      mrows = row + ROWSPERPAGE;
      if (mrows > rows) {
        mrows = rows;
        }

      srow = 0;
      for (row; row < mrows; row++) {
        ProgressUpdate(row, rows);
        if ((srow % 2) == 0) {
          HTMLSetRowStyle(h, "background-color: lightgrey; vertical-align: top;");
          }
        else {
          HTMLSetRowStyle(h, "background-color: white; vertical-align: top;");
          }
        HTMLRowOpen(h);
        for (col = 0; col < cols; col++) {
          HTMLCellOpen(h);
          HTMLAddText(h, tbl[row][col]);
          HTMLCellClose(h);
          }
        HTMLRowClose(h);
        srow++;
        }
      HTMLTableClose(h);

      HTMLSetBlockStyle(h, "font-size: 80%");
      HTMLAddPara(h, FormatString("Page %d of %d", page, pages), "right");
      HTMLSetBlockStyle(h);
      page++;
      }


We are going to loop over all the rows in the table. We start by using HTMLTableOpen to open an HTML table. We could optionally provide a width and other style information here. We set the style of all the rows using the HTMLSetRowStyle function. We want our table to have a heading row so the style here is different. We then use HTMLRowOpen and iterate over the columns to write out the first row as column headers. We use HTMLCellOpen, HTMLAddText, and HTMLCellClose to add content to the cells. We then close the row using HTMLRowClose and clear the style for the following rows by calling the HTMLSetRowStyle function again with no parameters.
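
To picture what the heading row amounts to, the Python sketch below builds comparable markup by hand. The post does not show the exact HTML that the HTML Writer emits, so the tag layout here is an assumption, and header_row is a hypothetical helper rather than part of Legato.

```python
from html import escape

# Same inline style string the script passes to HTMLSetRowStyle.
HEADER_STYLE = "font-weight: bold; text-align: center; text-transform: uppercase;"


def header_row(names, style=HEADER_STYLE):
    """Build one <tr> whose cells hold the escaped column names."""
    cells = "".join("<td>" + escape(name) + "</td>" for name in names)
    return '<tr style="' + style + '">' + cells + "</tr>"


print(header_row(["id", "first & last"]))
```

Escaping the cell text matters as soon as a column name or value contains characters such as & or <.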


Now that we have written a heading row to the document, we need to write the data rows. We calculate which row we should stop on using our define. Then we iterate over the data rows. The srow variable counts rows within the current table so that each table is striped properly, starting with a grey row. The code for the data cells is very similar to the heading code. We set the style for the row, open the row, then for each column open the cell, write the cell text, and close the cell before closing the row. When we are completely done, we close the table using the HTMLTableClose function. Now we can add a footer after each table saying what “page” we are on. We use the HTMLSetBlockStyle function to reduce the font size and HTMLAddPara to add the text. Then we reset the block style and increase the page count before looping again.
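
The paging arithmetic is easy to get off by one, so it can help to check it in isolation. The Python sketch below is an illustration of the logic, not Legato code; page_count, page_ranges, and stripe are hypothetical names. One way to get the page total is a ceiling division over the data rows, leaving out the header row.

```python
def page_count(data_rows, per_page):
    """Ceiling division: number of tables needed for data_rows rows."""
    return (data_rows + per_page - 1) // per_page


def page_ranges(total_rows, per_page):
    """Yield (page, first_row, stop_row) for a table whose row 0 is the header.

    stop_row is exclusive, mirroring the mrows variable in the script.
    """
    row = 1
    page = 1
    while row < total_rows:
        stop = min(row + per_page, total_rows)
        yield page, row, stop
        row = stop
        page += 1


def stripe(srow):
    """Alternate background colors, starting with grey on each page."""
    return "lightgrey" if srow % 2 == 0 else "white"


# A header plus 5 data rows, 2 rows per page.
for page, first, stop in page_ranges(6, 2):
    print("Page", page, "of", page_count(5, 2), "rows", first, "to", stop - 1)
```

With exactly 500 data rows and 250 rows per page, page_count returns 2, which matches the number of tables the loop actually writes.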


Now that all the data has been added to the HTML file we can write it all out.



    ProgressClose();

    HTMLSetBlockStyle(h, "font-size: 80%");
    footer = "Generated: " + FormatDate(GetLocalTime(), DS_MONTH_DAY_YEAR | DS_DATE_AT_TIME | DS_HHMMSS_12 | DS_INITIAL);
    HTMLAddPara(h, footer);

    HTMLAddFoot(h);
    HTMLWriterToFile(h, fnDest);
    RunProgram(fnDest);
    }


We close the progress window and then add a footer to the entire document that says when the HTML was generated. We tell the HTML Writer to add the appropriate HTML closing tags and then write the whole thing to our file using the HTMLWriterToFile function. Lastly, we use RunProgram to open the document in the default web browser to see what we got.


As you can see, the HTML Writer object allows us to quickly create HTML code without the use of templates or hardcoded HTML tagging. I didn’t use it for this blog, but the HTML Writer class also supports writing tables with random access. This means that instead of writing rows and columns in order, you can simply write cells using x and y positions. This can be used to write unsorted data in a sorted fashion. Whatever format your data is in, you can use Legato to help get it into another format for EDGAR, for a website, or even for catching up on stats for America’s favorite pastime.
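
The random access idea can be sketched in a few lines of Python. This only illustrates the concept of placing cells by position; it does not use Legato's random access functions, whose names are not given in this post, and set_cell and to_rows are hypothetical helpers.

```python
def set_cell(grid, x, y, text):
    """Store text at column x, row y; cells may arrive in any order."""
    grid[(x, y)] = text


def to_rows(grid):
    """Expand the sparse grid into a dense list of rows, blank where unset."""
    if not grid:
        return []
    width = max(x for x, _ in grid) + 1
    height = max(y for _, y in grid) + 1
    return [[grid.get((x, y), "") for x in range(width)] for y in range(height)]


grid = {}
# Cells are written out of order on purpose.
set_cell(grid, 1, 1, "Kim")
set_cell(grid, 0, 0, "first")
set_cell(grid, 1, 0, "last")
set_cell(grid, 0, 1, "Bo")
print(to_rows(grid))
```

Because each cell carries its own position, the order in which the data arrives no longer decides the order in which it appears in the table.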


 


David Theis has been developing software for Windows operating systems for over fifteen years. He has a Bachelor of Sciences in Computer Science from the Rochester Institute of Technology and co-founded Novaworks in 2006. He is the Vice President of Development and is one of the primary developers of GoFiler, a financial reporting software package designed to create and file EDGAR XML, HTML, and XBRL documents to the U.S. Securities and Exchange Commission.

Additional Resources

Novaworks’ Legato Resources

Legato Script Developers LinkedIn Group

Primer: An Introduction to Legato 



Posted by
David Theis
in Development at 17:19