Saturday, May 12, 2012

Create running totals with data.

This is a fairly common requirement when reporting on data. The data analyst will typically extract data using sql and then use the SUM formula within Excel (with the aid of fixed and relative references, that are copied down). For example, supposing we are looking at a table that records number of starters by dates. We could have data like the following:
"record_date";"yr1starter"
"2004-01-01";"820"
"2004-01-02";"1891"
"2004-01-03";"1667"
"2004-01-04";"1607"
"2004-01-05";"1542"
We copy this into a spreadsheet (normally by using text-to-columns so the data is in separate columns). In order to get a running balance we can then use the following formula: =SUM(B$2:B2) for cell C2. We then copy down, so for instance the formula in cell C3 becomes: =SUM(B$2:B3) It's not the only way to do this in Excel - you could always add the subtotal above to the current record field, but that runs the danger of a gap in the figures. The attached graphic shows what the cumulative formula list will look like
Excel will normally display the calculations for example:
yr1starters cumulative total
820 820
1891 2711
1667 4378
1607 5985
1542 7527
However, it is possible to create cumulative/running totals in SQL (via an inner query, normally called a sub-query). I will first show this using MySql syntax:
SET @runtotal:=0;
SELECT qry1.r_date, qry1.day_starts,
(@runtotal := @runtotal + qry1.day_starts) AS runningtotal
FROM
(SELECT
record_date AS r_date, yr1starts AS day_starts
FROM startstops
WHERE record_date between '2004-01-01' AND '2004-01-31'
ORDER BY record_date) AS qry1
The key is to set the initial variable @runtototal to 0, so it can then total up the running total (the ORDER BY clause at the end of the select statement is also important). Incidentally, this query (and other ways to achieve it using Mysql) can be seen at StackOverflow. It is actually a bit more difficult to achieve this using sql*server. One way is to use a cursor, which I feel is OTT for this. Another way is to use a cross-join, and so join the table to itself. Performance can be an issue here if the data set is large (helped possibly by indexes)
SELECT a.record_date, a.yr1starter, SUM(b.yr1starter) AS running_total
FROM dbo.startstops a
CROSS JOIN dbo.startstops b
WHERE (b.record_date <= a.record_date)
AND a.record_date between '2004-01-01' AND '2004-01-31' AND b.record_date between '2004-01-01' AND '2004-01-31'
GROUP BY a.record_date,a.yr1starter ORDER BY a.record_date,a.yr1starter
You can read about different ways of calculating running totals in sql*server (including the usage of cursors) at the sqlteam link.

Tuesday, April 24, 2012

Day #11 of a #30 day startup

Well, I got the site up and running over the weekend; you can find it at ChatParla. You can change the langauge wherever you are by clicking English, Deutsche or Castellano (Spanish), Registration to post comments (and threads) is fairly easy, all it needs is a valid email address. I won't now be able to work exclusively on this project, so it's no longer valid to call it a #30 day startup! Of course there are further enhancements I will make to the site over the course of time eg an admin section, log-in facility with OpenId, further formatting changes etc. Apart from my own time, the only expense I've incurred in setting this up has been a £10 domain registration; I already have some hosting space to run the site. The language switcher as set up up in Cakephp is not too difficult to reproduce on other types of site, so if you want a call-logging or a web directory that needs different language functionality, please get in contact. I hope this set of entries has been of interest.

Saturday, April 21, 2012

Day #10 of a #30 day startup

Latest addition to the site functionality has been to enter a "slug" when viewing a thread and associated comments. The "slug" is the title you see on a url eg www.mysite.com/view/2:this-thread-about-London where the slug is "this-thread-about-London". It makes it easier to see what a thread is about from its link, and aids also in SEO terms. For those cakephp programmers who'd like to know how to implement this, I took notes mainly from generate unique slugs and from routing in the cookbook I'm constantly busy on the site at the moment. Some good news is that I have a domain name now
, and I hope to write more on this in the next blog article.

Friday, April 20, 2012

Day #9 of a #30 day startup


I've made further progress on the site, and have set up the code so that titles can be translated automatically into the respective language, so long as I've inserted those translations into the relevant .po files. The attached screenshot should give an idea of how it looks when viewing in another language. I think I may well get the site up and running early next week, and adapt the site according to feedback. I believe it will already be useful for my own surfing/commenting purposes. Later I hope to include embedded vids from sites like Vimeo or Youtube, but I'd like to get the conversation parts working first.
For those who are developing in Cakephp and want to implement a multi-language site here are two useful links : cakephp book and Localizing Japan.

Wednesday, April 18, 2012

Day #8 of a #30 day startup


A productive day today. I put in the language selector mechanism. By selecting the language a session id will stay in memory, so that if for example you chose the German option, all the comments you see when perusing threads would be the ones made in German. It's also easy enough to change to English or Spanish if you want to look at what people are saying in other languages. I'm not doing the same with the thread header - ideally people will comment on the same discussion point but in their respective language. One thing I will now have to do is modify the code to display titles and help-text in the relevant language, but that shouldn't be too hard to implement. I've got some other functionality planned for the site, but it's looking as though I can migrate to the website sometime next week!

Day #7 of a #30 day startup


Just a very short entry for day 7. I've got to the stage where there are so many code elements (eg new record screens, checking user is entitled to make entries) that need checking and tweaking, that I haven't a lot of time to write articles. Maybe a longer one tomorrow?
I suppose that is my lesson learnt for the day - coding can be long and time-consuming!

Monday, April 16, 2012

Day #6 of a #30 day startup


Today I got into the essential nitty-gritty of writing a web application - allowing registration and log-ins of users, and putting in place authentification of data entry. Luckily I've done this before, and there is also a very helpful section on authorization in the cakephp documentation which makes a lot of the coding fairly straight forward, although the layouts and screen markups still need to be altered to fit screens. Initially I won't be putting restrictions on whoever can register onto the system - indeed there may not be many at all! However I am leaving status codes in the users table as there may well be hackers who create users merely to post spam - at which point I will also utilise the use of such services like Stop forum spam. As stated, to begin with I intend making registration (and consequent posting) as easy as possible. But there will be validation and sanitization of data entered - I'm not quite so reckless!