
1: Structured Query Language

My first exposure to “programming” was writing database queries.

In 2006 I was asked to define a process for creating legitimate securities pricing data from unstructured text. The first challenge was to understand the corpus of text we’d be dealing with and how sources varied by geography and asset class.

The same expression could mean different things depending on its context, and the shorter the expression, the harder it was to define. For example, to a bond trader the letter “T” refers to a U.S. Treasury security; to an equity trader it means AT&T. When that equity trader or a corporate bond trader sees the letters “WY” they see Weyerhaeuser, while a municipal bond trader sees the State of Wyoming.

Given just the pure text it was nearly impossible to discern the correct context. Information about the source of the text was required.
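To make that concrete, here is a minimal Python sketch of why the source matters. The lookup table, the desk labels and the `resolve` function are all hypothetical, invented here for illustration; they are not how the actual system worked. The point is only that the key has to include where the text came from, not just the text itself.

```python
# Hypothetical lookup: (symbol, source desk) -> meaning.
# The desk labels are illustrative assumptions, not the real system's categories.
SYMBOL_MEANINGS = {
    ("T", "government_bond"): "U.S. Treasury security",
    ("T", "equity"): "AT&T",
    ("WY", "equity"): "Weyerhaeuser",
    ("WY", "corporate_bond"): "Weyerhaeuser",
    ("WY", "municipal_bond"): "State of Wyoming",
}


def resolve(symbol, desk):
    """Return what a symbol means for a given source desk, or None if unknown."""
    return SYMBOL_MEANINGS.get((symbol.upper(), desk))


if __name__ == "__main__":
    # Same two letters, different answers depending on who sent them.
    print(resolve("WY", "equity"))
    print(resolve("WY", "municipal_bond"))
```

Without the second half of the key, “WY” has no single right answer.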

We operated in a heavily matrixed environment. I reported through the head of product, and the technologists were in the R&D line. Three of the company’s most talented programmers were assigned to this project. They had the huge task of ingesting documents (up to 500 per second) and identifying security references, prices and pricing convention (price, yield or spread to benchmark) for dozens of prices per document. Since this data was to be used to drive investment decisions, it had to be validated against other known sources. Permission to view it had to be securely assigned to specific users, and the data had to be delivered to them within 100 milliseconds. They had their hands full.

My job was to find out where to focus the parsing effort, and to do that I needed metrics. At first I asked the programmers to provide reports on who produced these documents, from what region and at what time of day.

Creating these reports for me took them away from the critical tasks they needed to complete. The solution we agreed on was that they would teach me how to make these reports myself. A programmer named Joshwini showed up at my desk one day and installed a program on my PC called Oracle SQL Explorer. It had access to the tables of data they used to generate my reports. She configured it and went back to do her job. I sat there looking at it and wondering what I was meant to do. There was a box labelled “query” so I sent Joshwini an e-mail asking “what’s a query?” She wrote back “SELECT * FROM TABLE”. Her response to my follow-up questions was that it would be easier for me to Google my questions than to wait for her to get the chance to answer. She was so right.
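For anyone wondering what the next step after “SELECT * FROM TABLE” looks like, here is a rough sketch of the kind of report I was after: documents counted by producer, region and hour of day. It uses Python's built-in `sqlite3` module rather than Oracle so that it runs anywhere. The `documents` table, its column names and the sample rows are all made up for illustration; I am not reproducing the real schema.

```python
import sqlite3

# Made-up sample rows: (producer, region, received_at). Purely illustrative.
SAMPLE_ROWS = [
    ("Desk A", "EMEA", "2006-03-01 08:15:00"),
    ("Desk A", "EMEA", "2006-03-01 08:40:00"),
    ("Desk B", "APAC", "2006-03-01 02:05:00"),
]


def documents_by_source(rows):
    """Count documents per producer, region and hour of day."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table layout, assumed for this sketch only.
    conn.execute(
        "CREATE TABLE documents (producer TEXT, region TEXT, received_at TEXT)"
    )
    conn.executemany("INSERT INTO documents VALUES (?, ?, ?)", rows)
    query = """
        SELECT producer,
               region,
               strftime('%H', received_at) AS hour_of_day,
               COUNT(*) AS doc_count
        FROM documents
        GROUP BY producer, region, hour_of_day
        ORDER BY doc_count DESC, producer, region, hour_of_day
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in documents_by_source(SAMPLE_ROWS):
        print(row)
```

The `GROUP BY` is what turns a raw dump of rows into a metric: one line per source and hour, with a count beside it.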