Old 08-06-2006, 01:29 PM   #1 (permalink)
DJMaze
Senior Contributor
 
DJMaze's Avatar
 
Join Date: Mar 2005
Posts: 672
DJMaze is on a distinguished road
[TIP] Easy filesize to human readable

Ever wanted nice, clean filesize output?
I've seen many solutions, but they are very bloated for such a simple task. Heck, I've even seen 72 lines of code in an article on CodeProject that I still don't understand: http://www.codeproject.com/cpp/formatsize.asp

The trick is to know your <math.h>
WikiPedia properly describes which powers are used.

So how do we know in which range we are?
This is simple to understand if you've read the WikiPedia article.
KB = 1024
MB = 1024*1024
GB = 1024*1024*1024
TB.....

Code:
#include <math.h>
#include <stdio.h>

// Bytes, Kilo, Mega, Giga, Tera, Peta, Exa, Zetta and Yotta
static const char *humanreadablebytes[] = {"B","KB","MB","GB","TB","PB","EB","ZB","YB"};

int main(void)
{
    // the filesize to calculate (must be > 0: log(0) is -infinity)
    unsigned long filesize = 22825;
    // calculate the humanreadable bytes index
    int i = (int) floor( log(filesize) / log(1024) );
    // output the data
    printf("filesize: %.02f %s\n", filesize/pow(1024, i), humanreadablebytes[i]);
    return 0;
}

output:
Code:
filesize: 22.29 KB
log Calculates the natural logarithm.
pow Calculates x to the power of y.

Basically speaking, "log(x) / log(1024)" is the inverse of pow(1024, y): it tells you which power of 1024 x is.
More information can be read here: http://en.wikipedia.org/wiki/Pow

You see? Only 4 lines of code to calculate a clean human readable output.
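To make the log() / log() step concrete, here is a minimal sketch of just the index calculation. The helper name unit_index is hypothetical, not from the post, and the guard for sizes below 1024 is an added assumption so that 0 never reaches log(). Results right at an exact power of 1024 depend on floating-point rounding, so treat this as an illustration rather than a hardened routine.

```cpp
#include <cmath>

// Hypothetical helper: index into {"B","KB","MB","GB",...} for a byte count.
int unit_index(double bytes)
{
    // log(0) is -infinity, and anything below 1024 is plain bytes anyway,
    // so skip the logarithm for small or zero sizes.
    if (bytes < 1024.0) {
        return 0;
    }
    // floor(log base 1024 of bytes) = how many whole times we can divide by 1024.
    return static_cast<int>(std::floor(std::log(bytes) / std::log(1024.0)));
}
```

With the value from the post, unit_index(22825) gives 1, which selects "KB".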

Want a rougher estimate that moves up to the next unit when the amount is >= 1000?
Code:
    // the filesize to calculate
    unsigned long filesize = 1062793248;
    // calculate the humanreadable bytes index
    int i = (int) floor( log(filesize) / log(1024) );
    // Get a rougher size to prevent large outputs
    if (filesize/pow(1024, i) >= 1000) { ++i; }
    // Prevent unknown format
    if (i > 8) { i = 8; }
    // output the data
    printf("filesize: %.02f %s\n", filesize/pow(1024, i), humanreadablebytes[i]);
output:
Code:
filesize: 0.99 GB
Just my $0.02 to make your life easier.
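For anyone who wants the rough variant as a reusable function, the following is a sketch under a few assumptions: the name human_size and the std::string return type are made up for illustration, sizes below 1024 skip the logarithm, and the index is clamped before the ">= 1000" bump so it can never run past the last unit.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

static const char *const kUnits[] = {"B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"};
static const int kLastUnit = static_cast<int>(sizeof kUnits / sizeof kUnits[0]) - 1;

// Hypothetical wrapper around the log()/pow() trick from the post.
std::string human_size(double bytes)
{
    int i = 0;
    if (bytes >= 1024.0) {
        i = static_cast<int>(std::floor(std::log(bytes) / std::log(1024.0)));
    }
    // Prevent unknown format: never index past "YB".
    if (i > kLastUnit) {
        i = kLastUnit;
    }
    // Rougher size: 1000..1023 of a unit is shown as 0.98..0.99 of the next one.
    if (i < kLastUnit && bytes / std::pow(1024.0, i) >= 1000.0) {
        ++i;
    }
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.2f %s", bytes / std::pow(1024.0, i), kUnits[i]);
    return buf;
}
```

Calling human_size(1062793248) yields "0.99 GB" and human_size(22825) yields "22.29 KB", matching the two outputs shown in the post.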
__________________

UT: Ultra-kill... God like!
Old 08-06-2006, 07:44 PM   #2 (permalink)
AssKoala
Anti-Zealot
 
AssKoala's Avatar
 
Join Date: Feb 2006
Location: Atlanta, GA
Posts: 72
AssKoala is on a distinguished road
Buffer Overflow

Cool idea, but that's far too complicated to be of good use like that.

Take this as constructive criticism.

Your implementation makes error checking more complicated than this function really needs. Basically, you would have to implement a more proper algorithm to do this in reverse to ensure that if your array of formats (KB, MB, etc) stops early, your algorithm will still work.

In other words, this is a better, in terms of stability, way to do it (the code is crude, I threw this together and tested it in about 10 minutes):

Code:
char *mem_types[] = { "B", "KB", "MB", "GB" };
int numTypes = sizeof(mem_types)/sizeof(char*);

double getHRSize(double fileSize_Bytes, int base, int *type) {
    double HRSize = fileSize_Bytes;
    int baseLevel = 0;
    
    for (;baseLevel < numTypes-1 && fileSize_Bytes >= base;baseLevel++) {
        fileSize_Bytes /= base;   
        HRSize = fileSize_Bytes;
    }
    
    *type = baseLevel;     
    
    return HRSize;
    
}
With example usage:
Code:
#include <iostream>

const int bibiBytebase = 1024;
const int byteBase = 1000;

void outputHR(double fileSize) {
    
    double HRSize;
    int type;
    
    HRSize = getHRSize(fileSize, bibiBytebase, &type);
    
    std::cout << "FileSize: " << HRSize << " " << mem_types[type] << std::endl;
    
    
}
So, with that code, it will not crash regardless of how big the input filesize is. If you used the above memory types (mem_types[]) in your code, you would read outside of your array for anything larger than 1023 GB.

Your algorithm is actually very elegant, you can quickly find out just how many divisions by the base are necessary which will directly correlate to the human readable format. The problem is, you would then have to do the algorithm I just posted in reverse to get it into a format that your types array supports. In other words, the above algorithm supports human readable types indefinitely (or as long as the data types will hold out).

There is also the problem of efficiency. Your algorithm requires two logs and raising to a power. One of the two logs (log(1024)) can theoretically be optimized out by the compiler, but chances are it won't be.

The direct approach (what I posted) will, in general, require only a few iterations (fewer than 4). It scales linearly with the number of units, so at a higher number of iterations, yours would be superior assuming an indefinitely long memory types list. However, it could be modified to use your algorithm to "jump" to the solution, then reverse solve into a solution that fits into the memory types array.
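One way to read the "jump, then fit into the array" idea is sketched below. This is an interpretation, not code from either poster: the function name hr_size_jump and the num_types parameter are assumptions, and the "reverse solve" is reduced to clamping the level to the last available unit before dividing.

```cpp
#include <cmath>

// Hypothetical hybrid: jump to the level with log(), then clamp it so the
// caller can safely index an array of num_types unit names.
double hr_size_jump(double bytes, double base, int num_types, int *type)
{
    int level = 0;
    if (bytes >= base) {
        // Number of whole divisions by base, found in one step.
        level = static_cast<int>(std::floor(std::log(bytes) / std::log(base)));
    }
    // Fit the result into the units that actually exist.
    if (level > num_types - 1) {
        level = num_types - 1;
    }
    *type = level;
    return bytes / std::pow(base, level);
}
```

With the four-entry mem_types[] array, a 5 TB input comes back as level 3 ("GB") with a value of 5120 instead of reading past the end of the array.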

This is overkill, but somewhat important considering program security nowadays. Plus, on many systems, you might not have access to math.h, be disallowed/discouraged from using math.h, lack a fast implementation of math.h, etc.
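For the "no math.h" case, a purely integer version is possible. The sketch below is an assumption-laden illustration: human_size_int is a made-up name, the input is taken as unsigned long long, and the two decimals are truncated (not rounded) from the remainder of the last division only.

```cpp
#include <cstdio>
#include <string>

// Hypothetical integer-only formatter: no floating point, no <cmath>.
std::string human_size_int(unsigned long long bytes)
{
    // A 64-bit count tops out below 16 EB, so "EB" is the last unit needed.
    static const char *const units[] = {"B", "KB", "MB", "GB", "TB", "PB", "EB"};
    const int last = static_cast<int>(sizeof units / sizeof units[0]) - 1;

    unsigned long long whole = bytes;
    unsigned long long rem = 0;
    int i = 0;
    while (whole >= 1024 && i < last) {
        rem = whole % 1024;   // remainder of the most recent division
        whole /= 1024;
        ++i;
    }
    // Turn the remainder (0..1023) into two truncated decimal digits.
    unsigned frac = static_cast<unsigned>(rem * 100 / 1024);

    char buf[48];
    std::snprintf(buf, sizeof buf, "%llu.%02u %s", whole, frac, units[i]);
    return buf;
}
```

Because the digits are truncated, the output can differ from the floating-point versions in the last decimal place.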
__________________
If you always think like an expert, you'll always be a beginner. | "A handful of knowledgeable people is more effective than an army of fools" -Writing Secure Code, 2nd Ed.
Old 08-07-2006, 01:14 AM   #3 (permalink)
DJMaze
Senior Contributor
 
DJMaze's Avatar
 
Join Date: Mar 2005
Posts: 672
DJMaze is on a distinguished road
Quote:
Originally Posted by AssKoala
So, with that code, it will not crash regardless of how big the input filesize is. If you used the above memory types (mem_types[]) in your code, you would read outside of your array for anything larger than 1023 GB.
Yes, I should have posted:
Code:
    // the filesize to calculate
    double filesize = 22825;
I didn't because Windows mostly uses DWORD for filesizes and DWORD is unsigned long, so Windows will crash on terabytes anyway.

Yours is slick as well; I've used a variant of that many times.
Code:
while (fileSize_Bytes >= base)
{
    fileSize_Bytes /= base;
    ++baseLevel;
}
if (baseLevel > 8) baseLevel = 8;
It does the same, and it also keeps working no matter how large the size gets.

I just wanted one using math.h and tried to reduce the amount of code used, and I must say it can't be fewer than 4 lines
Old 08-07-2006, 05:08 AM   #4 (permalink)
AssKoala
Anti-Zealot
 
AssKoala's Avatar
 
Join Date: Feb 2006
Location: Atlanta, GA
Posts: 72
AssKoala is on a distinguished road
Quote:
Originally Posted by DJMaze
I didn't because Windows mostly uses DWORD for filesizes and DWORD is unsigned long, so Windows will crash on terabytes anyway.
Well, Windows uses two DWORDs to represent the file size. Check out the documentation for "GetFileSize()". The maximum file size (assuming a crash at a file that's too large) is 2^64 - 1 bytes on 32-bit systems.

Not that that matters since the File System will determine the maximum size
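As a sketch of how the two halves fit together: GetFileSize() returns the low-order DWORD and stores the high-order DWORD through its second parameter. The helper below is hypothetical and deliberately portable, so it takes the two 32-bit values as plain arguments instead of calling the Windows API.

```cpp
#include <cstdint>

// Hypothetical helper: join the high and low 32-bit halves of a file size
// into one 64-bit byte count.
std::uint64_t combine_file_size(std::uint32_t high, std::uint32_t low)
{
    // Widen before shifting so the high half is not lost.
    return (static_cast<std::uint64_t>(high) << 32) | low;
}
```

The combined value can then be passed to a formatter as a double or a 64-bit integer, which avoids the 4 GB ceiling of a single DWORD.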

Quote:
I just wanted one using math.h and tried to reduce the amount of code used, and I must say it can't be fewer than 4 lines
Oh, don't get me wrong, yours is elegant and a fancy way to do it. I like it
Old 08-08-2006, 10:20 AM   #5 (permalink)
toe_cutter
Code Monkey
 
Join Date: Aug 2002
Location: Boston, MA
Posts: 79
toe_cutter is on a distinguished road
ls -lh
Old 08-08-2006, 11:10 AM   #6 (permalink)
redhead
Newbie
 
redhead's Avatar
 
Join Date: Jun 2002
Location: Denmark
Posts: 1,705
redhead is on a distinguished road
Or stat() and then use the provided st_size.
But since this can't be dealt with in an ISO/ANSI way, I guess that was the reason I didn't comment on it in the first place...
__________________
Don't worry Ma'am, We're university students, We know what We're doing.
-----
If you pull the pin, Mr.Grenade would no longer be your friend.
-----
01000111 01101111 00100000 01000011 00100000 00100001
Old 08-08-2006, 05:37 PM   #7 (permalink)
AssKoala
Anti-Zealot
 
AssKoala's Avatar
 
Join Date: Feb 2006
Location: Atlanta, GA
Posts: 72
AssKoala is on a distinguished road
Quote:
Originally Posted by toe_cutter
ls -lh
Seeing as this is the C++ forum, that's just a stupid comment.

Only a fool would execute a separate application like ls and use a parser to extract the file size data for what can be done in 10 lines of code.

Never mind that it's OS specific, susceptible to problems if ls's output changes, etc. So,

Quote:
Originally Posted by redhead
Or stat() and then use the provided st_size.
But since this can't be dealt with in an ISO/ANSI way, I guess that was the reason I didn't comment on it in the first place...
That has no relevance to what DJMaze posted or what I responded about.

The purpose of the post was to convert the file size to a human readable number. The structure provided by stat only provides the file size in bytes in the form of the member st_size. It has no bearing whatsoever on the post.

Now, had he said here's an ANSI way to obtain the file size:

Code:
int fileSize(FILE *fp) {
    int size=0,c=getc(fp);
    for (;c!=EOF;c=getc(fp),size++);
    return size;
}
Then responding with functions like stat(), GetFileSize(), GetFileSizeEx(), read_vnode() or whatever other OS specific functions would apply would make sense.
Old 08-08-2006, 07:42 PM   #8 (permalink)
redhead
Newbie
 
redhead's Avatar
 
Join Date: Jun 2002
Location: Denmark
Posts: 1,705
redhead is on a distinguished road
Quote:
That has no relevance to what DJMaze posted or what I responded about.
My my, are we touchy about this...
True, DJM's original post has nothing OS specific in it, but discussing DWORD sizes and calls to ls is quite the opposite, and since my response was regarding toe_cutter, I see no need to question my response..
Old 08-09-2006, 12:28 PM   #9 (permalink)
AssKoala
Anti-Zealot
 
AssKoala's Avatar
 
Join Date: Feb 2006
Location: Atlanta, GA
Posts: 72
AssKoala is on a distinguished road
Quote:
Originally Posted by redhead
... but discussing DWORD sizes ...
Not quite, seeing as my response to DJM's post was regarding potential overflows due to file sizes.

Quote:
Originally Posted by redhead
since my response was regarding toe_cutter, I see no need to question my response..
I'm not sure how capable you are in a forum, but general forum etiquette is to quote the person you are responding to. This forum doesn't have a hierarchy of responses; if it did, then quoting wouldn't be strictly necessary.
Old 08-10-2006, 06:27 AM   #10 (permalink)
redhead
Newbie
 
redhead's Avatar
 
Join Date: Jun 2002
Location: Denmark
Posts: 1,705
redhead is on a distinguished road
Quote:
This forum doesn't have a hierarchy of responses;
It is called viewing the thread in linear mode...
Old 08-22-2006, 06:34 AM   #11 (permalink)
toe_cutter
Code Monkey
 
Join Date: Aug 2002
Location: Boston, MA
Posts: 79
toe_cutter is on a distinguished road
Quote:
Originally Posted by AssKoala
Seeing as this is the C++ forum, that's just a stupid comment.

Only a fool would execute a separate application like ls and use a parser to extract the file size data for what can be done in 10 lines of code.

Never mind that it's OS specific, susceptible to problems if ls's output changes, etc. So,
I never suggested writing a parser, because the output is already in human readable form. I believe in not reinventing the wheel.

But you are right, this is a C++ forum, and it is OS specific, but then again you could run cygwin and it would still work.

TC