Thursday, August 27, 2015

Railroad History is Deja Vu All Over Again

Train crashes, passenger deaths, grade-crossing accidents, derailments.  These are not just the recent history of Metro-North, but events dating back decades.
A friend of mine, Paul Zajkowsky, loves reading the microfilms of our town’s weekly newspaper and has been feeding me clippings of all the stories about the New Haven line.  The news reports sound all too familiar.
APRIL 1939:           Two lads on their way to hunt bullfrogs are almost killed crossing the railroad tracks near Noroton Heights.
JANUARY 1944:     An empty train heading to New Haven smashes into a stopped local train at Darien, creating an explosion heard two miles away.  The engineer is killed and 14 passengers are injured.
FEBRUARY 1944 (and many subsequent dates):     A delivery truck runs off the road at the Hoyt St crossing on the New Canaan branch, gets caught on the tracks and is struck by an oncoming train.
JANUARY 1949:     A ten-year-old playing with a length of wire comes in contact with the 11,000-volt overhead catenary.  The resulting flash frightens but does not harm him.
DECEMBER 1954:  A Darien man exits Ernie’s Tavern and clambers up an embankment to cross the railroad tracks.  Struck by a NY-bound express, he suffers a broken leg.
JULY 1955:             Service has become so bad on the New Haven RR that Norman Cousins, editor of the Saturday Review, petitions the Interstate Commerce Commission to fine the railroad “as a hazard to public safety”. Cousins complains that trains are so crowded that a dozen passengers must ride standing in the vestibule while others ride in the washrooms.
JULY 1955:             New Haven RR President Patrick McGinnis tells the Norwalk Chamber of Commerce that local commuters will now have to pay $5 a month for station parking.  “I’m a businessman,” he tells them. “I’m not the Ford Foundation”.
OCTOBER 1955:     A flash flood washes out tracks near Noroton Heights just as a 78-car freight train passes through at 35 mph, derailing 23 of the cars and causing $10 million in damage.  Round-the-clock repairs continue for months.
FEBRUARY 1957:   A Bridgeport man, running to catch a train, sees it pulling out of the station and chases after it.  Grabbing the door of the last car, he’s unable to board and falls to the tracks.
JANUARY 1958:     Within hours of each other, two locomotives catch fire in the Park Avenue tunnel, shutting down all train service in and out of Grand Central.
JULY 1958:             The AM commute is disrupted when 11 cars of a freight train derail near Bridgeport.

DECEMBER 1958:  Citing mounting losses, the New Haven RR threatens to eliminate commuter service unless it receives $900,000 from CT counties and gets NY to waive $1 million in taxes on Grand Central.  Railroad President George Alpert warns “I do not propose longer to peril the New Haven RR by subsidizing NYC, Westchester and Connecticut”.  (The NHRR goes into bankruptcy (for the second time) in 1961.)

Monday, August 24, 2015

Stupid Programming Trick No. 25 part A – Hybrid allocations the Wild and Crazy way

Let’s talk about allocations of the simplest kind

Before I go any further, I must give credit to Tim German aka Cube Coder aka @CubeCoderDotCom. He and I came up with this approach as part of our Kscope15 presentation ASO, BSO, and Hybrid Calculations:  What’s Fastest, How Hard Is It, and Which One Should You Use?  Never fear, this isn’t merely a rehash of that presentation as I go into quite a bit more detail in this post.  Also, this is the Fast, Cheap, & Out of Control straw man proposal.  The next post will be (hopefully) quite a bit more sane.

To set the stage, this allocation post will:
  • Use a Hybridized version of Sample.Basic
  • Allocate the indirect Budget cost Distribution from a pool amount
  • Use a driver based on Actual Sales level 0 Product and Market as a percentage of Actual Sales Total Product and Total Market

I (and the world+dog of Essbase developers) have written allocations like this since the Year Dot.  I purposely chose this example because it is so basic and yet pervasive.  Never underestimate how easy Essbase makes this sort of logic.  I think sometimes we lose track of the still-revolutionary functionality of Essbase.  But I digress for the eleventy billionth time in the life of this blog.

Danger, danger, danger

The title of this post contains “Wild and Crazy” and what I will show you will make your eyes pop out, albeit not in an SNL 1970s Golden Age kind of way.  Nope, this one made my brain hurt when I wrote it.  I suppose that’s all a long way of saying that what I am going to illustrate works, but you almost certainly should never, ever, ever use this technique.  Think of it as a laboratory experiment that shows an interesting approach but is impractical in the real world.  I have been burnt by too many other consultants’ “genius” to want to touch this approach with a 40-foot barge pole, and I don’t want you to go this way either.

I write this post to illustrate how this could be done and only that.  And then to provide an approach to contrast in the next not-completely-insane-approach blog post.

You Have Been Warned.

But first what is the logic?

The goal is to allocate an indirect cost, in this case the Budget cost of distribution, to all level 0 Products and Markets.  Yes, this is Sample.Basic yet again, and why not?

Expenses are entered at level 0 in a pool member

The Budget data points to be allocated are at level 0 in each dimension. Note that No Market, No Product, and Pool are non-consolidating members specifically used to contain the data and not aggregate.

Calculate percentages

The allocation percentages are an interim step, normally never viewed.
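The percentages are easiest to see with a toy numeric example (all of the numbers below are invented, and Python is simply standing in for Essbase to show the arithmetic):

```python
# Toy illustration of the allocation driver -- every number is made up.
# Each level 0 (Product, Market) cell's share is its Actual Sales divided
# by Actual Sales at Total Product / Total Market.
actual_sales = {
    ("Cola", "New York"): 100.0,
    ("Cola", "California"): 300.0,
    ("Root Beer", "New York"): 200.0,
    ("Root Beer", "California"): 400.0,
}
total_sales = sum(actual_sales.values())   # 1000.0
pool = 50.0                                # pooled Budget Distribution cost

allocated = {cell: pool * sales / total_sales
             for cell, sales in actual_sales.items()}

# Because the shares sum to 100%, the aggregated result equals the pool.
assert abs(sum(allocated.values()) - pool) < 1e-9
print(allocated[("Cola", "New York")])     # 50 * 100/1000 = 5.0
```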

The allocated and aggregated results

Ta-da, Budget Distribution is now fully allocated.

With the logic out of the way, let’s delve into how this actually works across the various engines.

The BSO way

Outline

Surely you are familiar with this.  Certainly the readers of this blog are.

Assuming an already aggregated Sample.Basic, the code is as below.

Broadly, the code is as follows:
  1. FIX at level 0 Product and Market as well as the period.
  2. Apply the percentage: level 0 Product and Market Actual Sales divided by Actual Sales at Total Product and Total Market.
  3. Aggregate the Product and Market dimensions.
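The screenshots don’t survive in this text version, so here is a sketch of what that BSO calc script could look like against a Sample.Basic-style outline.  Member names such as “Pool”, “No Product”, “No Market”, and “Distribution” come from this example’s description; treat the exact syntax as illustrative, not as the post’s actual code.

```
FIX ("Budget", "Jan":"Dec")
    FIX (@LEVMBRS ("Product", 0), @LEVMBRS ("Market", 0))
        /* Each level 0 Product/Market intersection receives its share of
           the pooled cost: Actual Sales here divided by Actual Sales at
           Total Product -> Total Market. */
        "Distribution" = "Pool"->"No Product"->"No Market" *
            ("Actual"->"Sales" / "Actual"->"Sales"->"Product"->"Market");
    ENDFIX
ENDFIX
/* BSO only: roll the sparse dimensions up after the allocation. */
AGG ("Product", "Market");
```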

As I wrote, easy peasy lemon squeezy.

The problem with Hybrid

Hybrid is great, Hybrid is fast, Hybrid is the best of BSO and ASO, Hybrid is the bee’s knees, Hybrid is the berries and yet at the same time, Hybrid sucks.  What?  Mr-OMG-I-love-Hybrid has a problem with the tool?  Actually, yes I do.  As Maxwell Smart would say, “Missed it by that much”.

What’s the problem?  The whole raison d’etre behind Hybrid is that it uses an ASO engine wherever dynamic sparse member calculations exist; those are then dynamically aggregated via ASO.  This gets around the problem BSO has with scaling, in the form of calculation time and disk space, while avoiding ASO’s gruesome-to-some MDX and opaque calculation scripts.

Disk size isn’t much of a big deal today in the face of multi-terabyte hard drives (if this blog is around in five years, how we’ll all laugh at that as “big”) but a large collection of .PAG files is indicative of a long aggregation time; time is a constant as measured by us on Earth.  BSO’s genius is its retrieval speed (none of the engines beat it save for attributes) but its Achilles Heel is its inability to scale to larger databases.
If the ASO component of Hybrid is the solution to scaling, what’s the problem?  Those upper level Hybrid-enabled queries only work in a retrieval tool (Smart View, FR, etc.) and in MDX; FR only takes advantage of Hybrid if MDX is the retrieval language.  Where they don’t work is in calc scripts as of 11.1.2.4.  IOW, if you were to write a calc script that requires an upper level cross-dimensional reference that is retrieved via the Hybrid engine, i.e., just like this allocation process requires, the calc script will validate, and will work, but when that calc script refers to that upper level member, Hybrid reverts to true BSO sparse dynamic calculations.  This can be A Very Slow Thing (almost as long as that lovely run-on sentence above) and most certainly Something You Don’t Want.

As of 11.1.2.4.00x, there are only two ways to get ASO-driven Hybrid upper level data out of Essbase:
  • Run a query in Smart View/FR (again using MDX only)/etc.
  • Use MDX to get the data out – remember, neither DATAEXPORT nor Essbase report scripts will export Hybrid upper level data.
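As a sketch of that second option (the server, credentials, and paths below are placeholders, and the MDX statement itself is elided), the MaxL wrapper is little more than a spooled query:

```
/* Hypothetical MaxL wrapper: spool the MDX result set to a text file. */
login hypAdmin identified by hypPassword on localhost ;
spool on to 'c:/temp/mdxout.txt' ;
alter session set dml_output alias off ;
/* ...the MDX SELECT statement goes here... */
spool off ;
logout ;
```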

I never promised you a rose garden when it comes to a clean approach.  This is a Stupid Hack after all.

What to do, what to do, what to do?

The answer is decidedly blue.  (If you’re not a fan of the American Songbook, and you clicked on the link to hear that, and you’re still not a fan of the American Songbook, there is really no hope for you.  And if you aren’t a fan of the Velvet Fog and King Cole, well, I’m so flabbergasted I can’t even rant about it.)   

It’s all in the wrist

The Stupid Trick is to load upper level data into level 0.  By that I mean: retrieve data from an upper level intersection and then load that data into a level 0 bucket.  Then when the allocation process occurs, the calculations are all level 0 BSO.  The beauty of this approach is that no aggregation (other than what Hybrid performs via its ASO engine) is required so in a budgeting scenario, the turnaround between  input and allocation and retrieval of the spread data should be very quick.

Peas in a pod

The only difference in the Hybrid world is the upper level sparse members.  Where they were stored in the BSO Sample.Basic, in SampleH.Basic those members are set to Dynamic Calc.

Getting it out with an MDX crowbar

MDX is many things – powerful, flexible, a language with a fetish for parentheses, square brackets, and curly braces – but one thing it is not is a good data extractor.  It works, but you’ll not be well pleased with the result.

Before I go into that particular rant – no worries, it’s coming – let’s take a look at this query.

Not that bad

I am, to be charitable, not very good at MDX but even yr. obt. svt. was able to figure this one out.

After setting display properties (and giving you my VM’s username and password), the query:
  1. Selects Sales
  2. Puts periods in the rows
  3. Restricts the retrieval to Actual, total Products, and total Market.
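The query itself was a screenshot, so here is a reconstruction in that shape (using Sample.Basic dimension and member names, so consider it a sketch rather than the original):

```
SELECT
    {[Measures].[Sales]} ON COLUMNS,
    {[Year].Levels(0).Members} ON ROWS
FROM SampleH.Basic
WHERE ([Scenario].[Actual], [Product], [Market]) ;
```

The WHERE slicer pins the result set to Actual at the top of Product and Market, which is exactly the Hybrid-calculated upper level data this Stupid Trick needs.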

Quite bad, actually

If I had a $, £, ¢, R, etc. for every time I heard someone complain about MDX output over on Network54 I might not be a very rich man, but I’d at least be able to afford a better brand of Scotch.

Look at this.  Lord love a duck, this is gruesome.  And the gruesomeness is about to get much worse.

I don’t want header information, or the query repeated in the spool file, or column headers, or query success messages.  This isn’t too much of a good thing, it’s just too much, and most of it I don’t want.

I want doesn’t get

If Essbase is going to read the file it must be cleaned up and made fit for purpose via a parse of the file.  I am going to use a commonly used language in the EPM world – VBA in the form of VBScript – to do so, but as you will see it’s not an ideal tool.  Read it, marvel at its awfulness, and think about how you would do it in Perl or Groovy or Jython or just about anything else.
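To make the “just about anything else” point concrete, here is roughly what the same cleanup might look like in Python (my sketch, not the post’s code; the filter rule, keep only rows that start with a month name, reflects what a MaxL spool file of this query looks like, and the sample lines and numbers are invented):

```python
# Sketch of the spool-file cleanup in Python rather than VBScript.
# Assumption: data rows start with a month name; banners, the echoed
# query, column headers, and OK/INFO messages do not.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")

def clean_spool(lines):
    """Return only the data rows from a MaxL spool dump."""
    return [ln.strip() for ln in lines if ln.strip().startswith(MONTHS)]

# Invented sample of what the spool file contains.
spool = [
    "MAXL> SELECT {[Measures].[Sales]} ON COLUMNS ...",
    "        Actual   Product   Market",
    "Jan 31538",
    "Feb 32069",
    "OK/INFO - 1241150 - MDX query execution completed.",
]
print(clean_spool(spool))   # ['Jan 31538', 'Feb 32069']
```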

VBA est omnis divisa in partes tres


Unfortunately, I am not as pithy as Julius Caesar but I do my bit.  Also, I wish I had taken Latin classes, Classical (not Demotic as that might actually be practical) Greek, and more than just a few courses in philosophy.  At least I get to be a singularly focused autodidact in practically every aspect of my life.  And have a job.

A fish rots from the head down

Per Microsoft,
A Windows script (*.wsf) file is a text document containing Extensible Markup Language (XML) code. It incorporates several features that offer you increased scripting flexibility. Because Windows script files are not engine-specific, they can contain script from any Windows Script compatible scripting engine. They act as a container.

In the case of this script the header defines two required parameters:  the name of the MaxL file to be parsed and the name of the output text file.

‘Cos I can be a bit of a stickler when it comes to code, I’ve included the optional <example> node.

Lastly, the script language, VBScript, is defined.
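Pulling those pieces together, a minimal sketch of the .wsf shell looks like this (the file and parameter names are mine, not necessarily the post’s):

```
<job id="ParseMdxOutput">
  <runtime>
    <named name="infile"  helpstring="MaxL spool file to parse" required="true" type="string" />
    <named name="outfile" helpstring="Cleaned file for Essbase" required="true" type="string" />
    <example>cscript ParseMdxOutput.wsf /infile:mdxout.txt /outfile:clean.txt</example>
  </runtime>
  <script language="VBScript">
  <![CDATA[
    ' VBScript body goes here
  ]]>
  </script>
</job>
```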

The Mystery Meat

The MaxL MDX query output is beat-it-with-a-stick ugly.  So is this code albeit better formatted.  At least I destroy all of my instantiated objects at the end of the code.

Feet do your duty

Interestingly (to some at least), the actual execution of the code is controlled at the end of the script.  Why?  Why not at the top?  Damfino.

In any case, the footer reads the arguments, ensures that they are valued, and then calls the Main procedure.
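A sketch of that footer (again my reconstruction, assuming the parameters are named infile and outfile, and that Main is defined in the CDATA body above it):

```
' Read the two named arguments; if either is missing, show usage and quit.
If WScript.Arguments.Named.Exists("infile") And _
   WScript.Arguments.Named.Exists("outfile") Then
    Main WScript.Arguments.Named("infile"), WScript.Arguments.Named("outfile")
Else
    WScript.Arguments.ShowUsage
    WScript.Quit 1
End If
```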

Call it names

Running it isn’t so awful and once written, runs a treat.  I’ve even named the parameters and echo their values.  Ah, that pat on the back feels so good.

The output we’ve all been waiting for

An awful lot of pain for not much return.  At least it’s readable, although it still requires a tweak or two for Essbase.

But at least the stupid thing is readable by Essbase.  Almost.

Glenn, this is the moment you have been waiting for

Does success taste best when served cold?  Yes, MMIC, I am going to use an Essbase Load Rule and I am going to use it to do ETL.  Oh the shame, oh the ignominy, oh the hack, oh the hypocrisy.  

Yep, there it is – I’m hardcoding (gasp) the dimensions that aren’t defined in the from-the-Hell-of-MDX-query file to load those upper level members into level 0 buckets.

And now the calc

My oh my oh my we are finally here.  The upper level data from that MDX query has now been loaded to level 0 buckets and the allocation code – which looks almost exactly like the BSO code ‘cos that’s what it is – can use those totals to perform the allocation.

No aggregation is needed as this is Hybrid.  ASO on top of BSO is the dream come true.

Wot’ll she do, mister?

While all of this has been great fun, this level of complexity better be worth the candle.  Is it?  After all, Sample.Basic isn’t exactly a large database – there is no Slow with that wee beastie.

Using the same Essbase database that Tim German and I have used for the past two years – 40,000 customers against 70,000 products along with the typical Planning dimensions – the Hybrid numbers speak for themselves.

Engine      Performance in seconds
BSO         117
Hybrid      6

Yowsa.  Remember that 117 seconds assumes an already aggregated database.  So if we were to follow the typical Planning approach of enter, agg, allocate, agg, we’re looking at something more like four minutes.  Four minutes vs. six seconds is an eternity when a business rule is run on form save.

In the Tree, part of the Tree

So is this the way forward?  Yes and no.

Is this fast?  Yes.

Will you ever use this particular approach?  No.  At least I hope not.

The next post in this Stupid Trick will show a much saner, much simpler, and much more likely approach to allocations in Hybrid.

Be seeing you.

Sunday, August 16, 2015

“The Fairest (and Least Popular) Way To Pay for Roads”



Back in April I wrote about the challenge we face to pay for Gov. Malloy’s $100 billion transportation plan.  And I expressed sympathy for his bipartisan, blue-ribbon panel tasked with coming up with funding alternatives, the Transportation Finance Panel.
To be honest, I think that panel may be on a fool’s errand.  They’re trying to pay for a wish list of projects not of their making and many of which may not be necessary let alone affordable.  Maybe we only need $50 billion.  But it’s not their mandate to question our “transportation Governor”.  Someone else will have to do the “vetting.”
But even as the Finance Panel does its work, exploring all manner of funding options, they are being second-guessed by politicians and public alike.
How about tolls?  Too expensive… they’ll slow traffic… and don’t forget those flaming truck crashes at toll barriers!  (Not true… no they won’t… and there won’t be toll barriers; high-speed toll collection means no booths.)
Gas tax?  Unfair… out-of-state motorists won’t pay… improved gas mileage means dwindling revenue.  (Totally fair… maybe so… and absolutely correct).
Which brings us to what would seem to be the fairest, most equitable fund raising mechanism for paying for our roads, but which brought a bipartisan crap-storm of response when suggested:  a mileage tax, or VMT (vehicle miles traveled) tax.
The concept is simple:  have each motorist pay a tax for the number of miles he/she drives each year.  The data could be collected electronically by GPS or with an odometer check when you get your annual emissions inspection.  You drive more, you pay more… whether you drive on I-95 or back-country roads.  Take mass transit, you’d drive less and pay less.
The VMT idea was discussed at the Finance Panel’s July 29th meeting, and the public and political reaction was immediate and universally negative.
Senate Majority Leader Bob Duff (D-Norwalk) called it “unproven”, despite successful trials in the Netherlands and Oregon and VMT’s endorsement by the US Government Accountability Office.
Republican State Senator Toni Boucher called VMT nonsensical and an invasion of privacy, though testimony proved both claims wrong.
Face it:  nobody likes a tax that they have to pay.  Tax the other guy… the trucker, the out-of-state driver, the real estate transferor… but don’t tax me!
Driving a car is not free.  Paying for gasoline is only part of the cost and even Connecticut’s relatively high gas tax comes nowhere near to paying for upkeep of our roads. Our deteriorating roads are a hidden toll as we pay for car repairs.
The Transportation Finance Panel will find there is no easy or popular solution to paying for the Governor’s $100 billion untested and unattainable wish list of projects.  Whatever they recommend, citizens will scream bloody murder and their lawmakers will vote it down.
But shame on reactionaries in Hartford for calling the VMT, or any funding alternative, “dead on arrival”.  Let’s at least let the Finance Panel do its diligence before saying they have wasted their time.
