Microsoft time estimation?

Why does the time estimation progress bar in Windows differ radically from reality?

I think it may just be “Microsoft Standard Time”: a completely separate time zone from any that exist on this planet, which they also use to schedule their delivery timetable… :)

Actually, it’s a computational issue. The problem comes down to what you know versus what you need to know. All things being equal, if you have taken 1 minute to do 10% of the work, you would expect to finish in 9 more minutes (1 minute for 10% = 10 minutes for 100%).
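As a rough sketch of that "all things being equal" arithmetic, the estimate is just a linear extrapolation of the average rate so far. The function name `naive_remaining` and its parameters are made up for illustration; this is not how Windows (or any particular tool) is known to do it.

```python
def naive_remaining(elapsed_seconds: float, fraction_done: float) -> float:
    """Linear extrapolation: assume the rest runs at the average rate so far.

    Hypothetical helper for illustration only.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    # Projected total time if the remaining work costs the same per unit.
    total_estimate = elapsed_seconds / fraction_done
    return total_estimate - elapsed_seconds


# 1 minute elapsed, 10% done: roughly 9 minutes (540 seconds) to go.
remaining = naive_remaining(60.0, 0.10)
print(round(remaining / 60.0, 1), "minutes remaining")
```

The estimate is only as good as the assumption that the remaining 90% behaves like the first 10%.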

You could argue that any “processing overhead” is already catered for in the averages for the work you have just done, but unfortunately it is not, because the work still to come may look nothing like the work already completed. Zipping files makes for an interesting worked example:

  • The size of the file you are adding affects the time it takes to compress it.
  • The compressibility of the file is also a factor; JPEG images, for example, do not compress well.
  • The number of files of the same type/name/make-up also matters.
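To get a feel for how much compressibility varies, here is a small sketch using Python's standard `zlib` module. It compares a highly repetitive buffer with random bytes, which stand in (as an assumption) for already-compressed data such as JPEGs. It measures output size rather than time, and it is not the zip tool used for the timings in this post.

```python
import os
import zlib

SIZE = 100_000  # bytes per sample; an arbitrary choice for illustration

# Highly repetitive data compresses extremely well.
repetitive = b"backup " * (SIZE // 7)
# Random bytes behave much like already-compressed data: almost no gain.
random_like = os.urandom(SIZE)

for label, data in (("repetitive", repetitive), ("random-like", random_like)):
    packed = zlib.compress(data)
    ratio = len(packed) / len(data)
    print(f"{label}: {len(data)} -> {len(packed)} bytes (ratio {ratio:.2f})")
```

Two files of the same size can therefore represent very different amounts of work, which is exactly what a size-based progress estimate cannot see.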

The version of zip I was using had an interesting quirk, and I’m not sure whether it’s a ZIP thing or an implementation thing: when you add a file with the same name as an existing file in the archive, the originals are unpacked, compared and re-zipped. The spikes on this graph show that.

file addition timings

I also tried randomising the file list so that directories of images and directories of large backup files were distributed through the run. This did not make the results much more accurate; it basically just moved some of the peaks a little.

In any event, you end up with something like the following graph for the entire run.

Raw data

So, how could the graph be improved?

Step 1: Smoothing.
One of the easier ways to clean dirty data is to smooth it. This involves taking an average of the last few sets of timings. While this will not ‘boost’ the missing data, it will give you a better platform to make any alterations on. You end up with this.
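A minimal sketch of that kind of smoothing is a trailing moving average over the last few estimates. The window size of 5 and the name `smooth` are assumptions for illustration; the post does not say how many timings were averaged.

```python
from collections import deque


def smooth(estimates, window=5):
    """Trailing moving average: each output is the mean of the last
    `window` estimates seen so far (fewer at the start of the run)."""
    recent = deque(maxlen=window)  # old values drop off automatically
    smoothed = []
    for estimate in estimates:
        recent.append(estimate)
        smoothed.append(sum(recent) / len(recent))
    return smoothed


# A single spike is spread out and flattened rather than shown in full.
print(smooth([10, 9, 8, 23, 7, 6], window=3))
```

A larger window gives a calmer curve but reacts more slowly when the rate genuinely changes.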

Average smoothing

Step 2: Limiting.
A better approach is to show the lowest estimate seen so far. As time progresses, you would expect the time remaining to always go down; however, when something takes quite a bit longer, it can skew your averages and make the raw estimate actually increase. You’ve all seen it… 10 seconds to finish copying, 9 seconds, 8 seconds, 23 seconds… :)

As far as the user can see, all you are doing is pausing the countdown. That is still not great, but potentially better than the disheartening time-doubling effect.
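The limiting step can be sketched as a running minimum over the stream of estimates. The name `limit` is hypothetical, and the sample numbers are the countdown from the example above.

```python
def limit(estimates):
    """Running minimum: never display a value higher than one already shown."""
    lowest = float("inf")
    limited = []
    for estimate in estimates:
        lowest = min(lowest, estimate)
        limited.append(lowest)
    return limited


# The jump to 23 seconds is displayed as a countdown stuck at 8.
print(limit([10, 9, 8, 23, 7]))  # [10, 9, 8, 8, 7]
```

The trade-off is that an early, overly optimistic estimate sticks: the display cannot recover upwards even when the higher figure is the honest one.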

Average and minimum

Showing this a little more clearly:

Close-up of average and minimum

Step 3: Factoring.
The smoothed minimum still deviates from the true remaining time, mostly early in the run, and we can compensate for that. If you start by, say, doubling the time estimate, and then reduce the factor as time progresses and the value gets more accurate, until you are not factoring at all, you can get quite a good match once things get going.
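One way to sketch the factoring step is a multiplier that starts at 2 and fades to 1 as the job progresses. The linear fade and the name `factored` are assumptions; the post only says the factor starts at double and is reduced to nothing over time, not what shape the reduction takes.

```python
def factored(estimate: float, fraction_done: float,
             start_factor: float = 2.0) -> float:
    """Scale an estimate by a factor that fades linearly from
    `start_factor` at 0% done to 1.0 (no adjustment) at 100% done."""
    if not 0.0 <= fraction_done <= 1.0:
        raise ValueError("fraction_done must be in [0, 1]")
    factor = start_factor - (start_factor - 1.0) * fraction_done
    return estimate * factor


# The same 100-second raw estimate at different points in the run.
for done in (0.0, 0.5, 1.0):
    print(f"{done:.0%} done: show {factored(100.0, done):.0f}s")
```

Early estimates, which are the least trustworthy, get padded the most; by the end of the run the displayed value is the raw estimate again.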

Overall time line

This gives you only a little deviance from the actual progress we would like, throughout the entire run.

Deviance from expected

Mind you, if Microsoft were to implement this, then the maths involved would double the amount of time it would take anyhow. :)
