
DarkBASIC Professional Discussion / Trying to load and edit a .CSV file in Dark Basic Pro

DVader
Joined: 28th Jan 2004
Posted: 15th Aug 2008 03:54 Edited at: 15th Aug 2008 03:58
Hi, I am trying to help my brother with a web page he is trying to set up. He needs a program to read and edit .csv files, possibly making a new one out of two separate .csv files. Ultimately he wants a practically automatic system that can download a .csv from one website, update it and send it on to another site.
Firstly, I think I will be able to do this in DB, but I am not certain. I certainly don't want to learn any other languages at present; for some reason I particularly hate the style of most internet-based languages I have seen. Call me old school, I suppose: I started on a Spectrum, dabbled with the Amiga (AMOS, Blitz) and finally found DarkBASIC on the PC after a few years of trying to find an AMOS equivalent.
I have written a basic bit of code to load up the .csv and print everything out, counting the products as it goes. It works fine with one .csv file which is around 700 KB, but fails on a second which is over 3 MB.
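A rough stand-in for the kind of loader described (the array size and file name are guesses, and the original code also called FIND NEXT inside the loop, which Van B picks up on below):

rem read each line of the csv into an array, counting the products as it goes
dim entirescript$(200000)
open to read 1, "products.csv"
products = 0
for n = 0 to 200000
   if file end(1) = 1 then exit
   read string 1, entirescript$(n)
   rem (the original also called FIND NEXT here; see the replies below)
   print entirescript$(n)
   inc products, 1
next n
close file 1
print "Products read: " + str$(products)
wait key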



This code works fine with the small .csv but not this large one. Is there a text size limit or anything like that? Any ideas would be really appreciated.


Attachments
DVader
Joined: 28th Jan 2004
Posted: 15th Aug 2008 04:05 Edited at: 15th Aug 2008 04:11
Just in case you want to see the working one, here is the file; just change the relevant text in the program to suit, or rename the file (as if you don't know).
EDIT
BTW, I am not asking for help with the whole thing; the main thing I am wondering about right now is why this .csv file does not work. Of course, pointers on the rest would be useful.


Attachments
Phaelax
DBPro Master
Joined: 16th Apr 2003
Location: Metropia
Posted: 15th Aug 2008 06:38
What error do you get on the large file? Does it have more than 200k lines? What do you need to do with the file? Loading every line into an array may not be the best method. And if this isn't needed for a game but just a tool for a website, there's a dozen languages better suited for this task.


Olby
Joined: 21st Aug 2003
Posted: 15th Aug 2008 11:14
Try PureBASIC - it will do this task a hundred times better and faster.


ACER Aspire 5920G: Core2Duo 2.2GHZ, 2GB, GeForce 8600M GT 1280MB, DirectX10, DBPro 6.8
DVader
Joined: 28th Jan 2004
Posted: 15th Aug 2008 14:57
I get a nice Windows exception error as soon as it hits the same line in the csv file. If you use the first .csv file I have uploaded, it crashes at the same point every time. If you load the smaller one it works great. The major difference between the two files is that the product descriptions tend to be a lot larger. I was wondering if there is any sort of size limit for the READ STRING command.

IanM
Retired Moderator
Joined: 11th Sep 2002
Location: In my moon base
Posted: 15th Aug 2008 15:03
I'd call this error 'running out of memory', which, admittedly, DBPro doesn't deal with very well, but PureBasic would have exactly the same problem.

Why do you need to read the whole thing into memory before doing anything with the data? Can't you operate on a line-by-line basis?

DVader
Joined: 28th Jan 2004
Posted: 15th Aug 2008 16:01
I am just testing whether I can load the files in before I go into it any further. Is there a way of increasing the available memory, or does it always use the maximum available? I remember having a similar issue in AMOS, and there was a command to change the amount of memory used.

Van B
Moderator
Joined: 8th Oct 2002
Location: Sunnyvale
Posted: 15th Aug 2008 16:09
Can I ask what the FIND NEXT command is there for?

READ STRING does not need it; FIND NEXT is for scanning directories.

You should really load in each line one at a time, make whatever changes you need to, then save the line out to an export file; there is no need to load in the whole thing unless you're doing far more complex analysis on it. I don't see any reason why DB can't handle this - compared to learning a new language like PureBasic, getting DB to do this is far more straightforward.
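For example, a line-at-a-time pass could look something like this (the file names are placeholders, and the 'change' step is left as a comment):

rem read the source csv one line at a time, tweak it, and write it straight back out
if file exist("output.csv") = 1 then delete file "output.csv"
open to read 1, "products.csv"
open to write 2, "output.csv"
while file end(1) = 0
   read string 1, a$
   rem ... make whatever changes are needed to a$ here ...
   write string 2, a$
endwhile
close file 1
close file 2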


Health, Ammo, and bacon and eggs!
DVader
Joined: 28th Jan 2004
Posted: 15th Aug 2008 17:19
A mistake on my part; I thought it moved to the next line of text. I took it out and it still works. However, that has not helped, because now I cannot see how to control which line I read.

Van B
Moderator
Joined: 8th Oct 2002
Location: Sunnyvale
Posted: 15th Aug 2008 17:26 Edited at: 15th Aug 2008 17:26
Well I'd usually just use FILE END...


rem entirescript$() must be dimensioned large enough first, eg: dim entirescript$(200000)
open to read 1,"products.csv"
count=0
while file end(1)=0
   read string 1,entirescript$(count)
   inc count,1
endwhile
close file 1

File end will equal 1 when it reaches the end of the file, and READ STRING will simply read the next string from the specified file.


Health, Ammo, and bacon and eggs!
Attila
FPSC Reloaded TGC Backer
Joined: 17th Aug 2004
Posted: 15th Aug 2008 19:03 Edited at: 15th Aug 2008 19:04
READ STRING reads one line (one record) from a file; on Windows, lines are delimited by CR+LF (chr(13)+chr(10)). Open your file in Notepad to check that it actually contains line delimiters. The line length is limited by the maximum length a string can be in DarkBASIC (I think it's 256 chars).

When using WRITE STRING in DB, the string is automatically terminated with CR+LF, so such strings can easily be read back with READ STRING.

Some operating systems and programming languages treat everything as a stream; they do not know about file formats and therefore do not know about records. In a Windows environment, though, a file can be treated as a set of records, much as a database contains tables and a table contains rows (a database row is basically a record). The unit DB processes is a record, which can then be divided into data fields of anywhere from one byte up to the maximum size of a string (using the MID$, LEFT$ and RIGHT$ commands).
When working with records and data fields you never need a pointer, because you address your data by its name (a variable name) rather than by a location that a pointer points at.
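For instance, one record can be cut into its data fields with nothing but the native string commands (the record contents and array size here are made up):

rem walk through one record character by character and split it at the commas
record$ = "1001,widget,9.99"
dim part$(31)
parts = 0
current$ = ""
for i = 1 to len(record$)
   c$ = mid$(record$, i)
   if c$ = ","
      part$(parts) = current$ : inc parts, 1 : current$ = ""
   else
      current$ = current$ + c$
   endif
next i
part$(parts) = current$ : inc parts, 1
for i = 0 to parts - 1
   print part$(i)
next i
wait key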
IanM
Retired Moderator
Joined: 11th Sep 2002
Location: In my moon base
Posted: 15th Aug 2008 19:33
@DVader,
You have almost 2GB of usable memory on a standard windows installation - you can't increase this with DBPro. You'll also find that as you get closer to this limit, your disk activity will increase as other programs/data are paged to disk to allow your data to be placed in memory - things will get slower and slower.

Also, you might want to take a look at my plug-ins. They don't have the size limitations of the DBPro file commands, and are faster too - there's also a command to append to an existing file.

So I suggest my file plug-in for the file I/O, operating on one line at a time, and my string plug-in for splitting the CSV line into individual fields.

DVader
Joined: 28th Jan 2004
Posted: 15th Aug 2008 20:35
It is definitely the size of certain individual records that is causing the file to fail. Editing the file and increasing the first record's description to three times its size caused it to crash on that first record when loading into DB.

Thank you all for your help and advice. I am not sure about your plug-in yet, IanM; I have downloaded it and copied the directories over my DB ones. I just have to find the commands and see if I can use them. If they have a bigger size limit, that will probably sort it out.

spooky
Joined: 30th Aug 2002
Location: United Kingdom
Posted: 15th Aug 2008 23:16
There is a long-standing bug, still not fixed, where READ STRING will fail if the string is over 1331 characters long.

See here:

http://forum.thegamecreators.com/?m=forum_view&t=81894&b=15
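If you want to see it for yourself, a minimal repro along these lines should do it (the file name is arbitrary):

rem write one very long line, then try to read it back with READ STRING
if file exist("longline.txt") = 1 then delete file "longline.txt"
a$ = ""
for i = 1 to 2000
   a$ = a$ + "x"
next i
open to write 1, "longline.txt"
write string 1, a$
close file 1
open to read 1, "longline.txt"
rem READ STRING reportedly fails here once the line passes 1331 characters
read string 1, b$
close file 1
print len(b$)
wait key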

Boo!
DVader
Joined: 28th Jan 2004
Posted: 16th Aug 2008 14:00
Spooky, ah, that would explain it then; funny how I always seem to come across weird bugs like this in almost every project I do. I tried your plugin, IanM, and it has loaded the full csv up no problem. Not quite sure about your string commands though; I will have to play with those a bit.

IanM
Retired Moderator
Joined: 11th Sep 2002
Location: In my moon base
Posted: 16th Aug 2008 15:58
You'll probably end up kissing the SPLIT STRING command once you've used it

DVader
Joined: 28th Jan 2004
Posted: 16th Aug 2008 18:27
A quick question about saving the file back to disk. I am not sure how the WRITE DATAFILE STRING FileID, Value$ command works from the description. The ID I assumed was the same as in DB (1-32), but I am not sure what the value is supposed to be. DATAFILE STRING TYPE sort of makes sense, but I don't see how it relates to the previous command as suggested in the description. I am finding this an interesting distraction compared to doing games, but I must confess I normally avoid this sort of thing, usually just using SAVE ARRAY in games for data and such.

Any hints?

IanM
Retired Moderator
Joined: 11th Sep 2002
Location: In my moon base
Posted: 16th Aug 2008 18:44
If you understand DBPro's file commands, then you understand mine.

Here's Van's code from earlier:
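(Van B's loop from above, assuming entirescript$() has been dimensioned beforehand:)

open to read 1,"products.csv"
count=0
while file end(1)=0
   read string 1,entirescript$(count)
   inc count,1
endwhile
close file 1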


Here's the equivalent using my plug-in:
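(A sketch of the conversion; OPEN DATAFILE TO READ, READ DATAFILE STRING(), DATAFILE END() and CLOSE DATAFILE are my best guess at the reading-side names based on the description below, so check them against the plug-in's own documentation:)

open datafile to read 1,"products.csv"
count=0
while datafile end(1)=0
   entirescript$(count)=read datafile string(1)
   inc count,1
endwhile
close datafile 1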

Conversion is simple and pretty much one-to-one for the DBPro commands - you simply change some command names, and change commands for reading into function calls.

When writing strings, instead of using WRITE STRING you'd use WRITE DATAFILE STRING.

- OPEN DATAFILE TO WRITE will automatically clear the file and open it, while the DBPro OPEN TO WRITE will silently fail.
- OPEN DATAFILE TO UPDATE has no DBPro equivalent - it allows read/write of the file.
- OPEN DATAFILE TO APPEND has no DBPro equivalent - it allows reads from anywhere in the file, and writes only to the end of the file.
- DATAFILE STRING TYPE is used to tell the plug-in how your strings are stored in the file. If you can open your file using notepad without it looking bad (ie, it doesn't have squares where the line-breaks should be) then set a type of 1 when reading. Otherwise, leave it alone (it defaults to 0).

DVader
Joined: 28th Jan 2004
Posted: 16th Aug 2008 18:58 Edited at: 16th Aug 2008 19:15
Here is what I have coded so far; I almost got it while waiting for a reply.

It seems to save the file, but it also seems to get stuck in a loop of death for some reason I can't quite see at the moment. I will check out your examples above and see if they shed any light. Thanks.

EDIT
BTW, I had already looked at Van B's example and adapted it, so cheers, Van B!
I tested the output file and it loads into Notepad and even OpenOffice Calc looking OK. But it still seems to be stuck in that loop of death. If I decrease the maximum size of the array it says array out of bounds, even though it has put a lot of blanks at the end. Something silly is causing it, I am sure.

IanM
Retired Moderator
Joined: 11th Sep 2002
Location: In my moon base
Posted: 16th Aug 2008 19:26


DVader
Joined: 28th Jan 2004
Posted: 16th Aug 2008 19:54 Edited at: 17th Aug 2008 03:03
Ah, nice one, thanks. I fudged it earlier with this, but I will change it to yours as it is a little more elegant.



EDIT
An old-fashioned way of doing it, as such, but it worked. I have changed it over to your code now and it is working great.

DVader
Joined: 28th Jan 2004
Posted: 21st Aug 2008 01:18 Edited at: 21st Aug 2008 04:48
Gah, I have looked at your commands for splitting strings and they look good, but I cannot really see how to use them. Have you any examples? I am thinking about using normal DB commands and LEFT$ at the moment, because I am unable to get your commands to work, but I prefer the look of your commands for easier coding (I assume).
Here is the code I have tried to use; I get the following error:
Cannot perform 'integer' cast on type 'TEMP3'
It points to the s(n-1) = get split word$(n) line.



I am a little confused about the delimiter string. Can I use more than one? If so, how exactly? Also, can I use " as a delimiter to differentiate between text and variables?
Sorry if I am taking up a lot of your time with this, but now that I have started using these commands I would like to use these too.

EDIT

Ha! I looked again and noticed my stupid mistake!


So it is sorted now and works just as I hoped. Hopefully, I can now finish this off and get back to a game! BTW, the question about multiple delimiters still applies. Would I do something like this?

split string fullcsv$(1), ",:;"

Would that check for all the above characters? I will probably have a look later on and test something, but I am now going to sort out this csv! Cheers for all the help, everyone; I have learned a few new things doing this little project.
Anyone thinking about downloading IanM's extra commands but put off because they are a little confusing at first: take it from me, I am not a brilliant programmer by any means (see all the above posts), but it has only taken me a day or two to do this little thing, and I have not really spent lots of time on it. If I can use the commands, anyone can, so definitely try it!

IanM
Retired Moderator
Joined: 11th Sep 2002
Location: In my moon base
Posted: 21st Aug 2008 15:13
You can specify as many single-character delimiters in the delimiter string as you need. For instance, I recently used that to split out values from lines formatted like so: key1 = value1, key2 = value2, key3 = value3
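For a line of that sort, a minimal sketch might be the following (the line itself is made up, and I am assuming GET SPLIT WORD$ is 1-based, as in the snippet quoted earlier):

rem split one key/value line on both '=' and ','
csvline$ = "key1=value1,key2=value2,key3=value3"
split string csvline$, "=,"
for n = 1 to 6
   print get split word$(n)
next n
wait key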

I have decided to put together a specific command to split correctly formatted CSV lines (as per the Excel spec), as the existing command makes them difficult to deal with correctly. You can't currently have commas within a field, for example; Excel would put double-quotes around such a field.

IanM
Retired Moderator
Joined: 11th Sep 2002
Location: In my moon base
Posted: 29th Aug 2008 00:59 Edited at: 29th Aug 2008 01:12
Well, I don't know exactly what you wanted to do with the file, but here's an example of the new SPLIT CSV STRING command that you can tweak to do what you want:
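A rough sketch of the sort of usage, assuming SPLIT CSV STRING takes the line as its only argument and the fields come back through GET SPLIT WORD$ just as with SPLIT STRING; the original example also switched off the write-flushing mentioned below, which this sketch does not:

rem a made-up line with a quoted comma, the way Excel would save it
csvline$ = chr$(34) + "Widget, large" + chr$(34) + ",9.99,3"
split csv string csvline$
for n = 1 to 3
   print get split word$(n)
next n
wait key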


Important: When writing to a datafile, my plug-in plays it safe and flushes all writes to disk immediately. I've switched that off in this code because it isn't needed, so the code runs many times faster than it normally would with it enabled - it flushes only when the write-buffer is full.

[EDIT] Changed the line-ending on the output file to CRLF so that the file looks right when opened with Notepad.

Visigoth
Joined: 8th Jan 2005
Location: Bakersfield, California
Posted: 30th Aug 2008 12:57 Edited at: 30th Aug 2008 13:03
I've been working on something where I have to read .csv files, and found the TOKEN() commands to be very useful. By using a two-dimensional array, you can make a "database" in which you can access each field in the file. A sample of how the token commands work:
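A minimal sketch of the idea, assuming the commands meant here are DBPro's FIRST TOKEN$() and NEXT TOKEN$() (the line and field array are made up; see the caveats in the reply that follows):

rem split one csv line into fields using the token commands
dim parts$(31)
csvline$ = "widget,9.99,3"
count = 0
t$ = first token$(csvline$, ",")
rem assuming NEXT TOKEN$ returns an empty string once the tokens run out
while t$ <> ""
   parts$(count) = t$
   inc count, 1
   t$ = next token$(",")
endwhile
for i = 0 to count - 1
   print parts$(i)
next i
wait key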

IanM
Retired Moderator
Joined: 11th Sep 2002
Location: In my moon base
Posted: 30th Aug 2008 13:45 Edited at: 30th Aug 2008 14:44
The token functions are very useful and fast, but there are a few weaknesses in them.

Firstly, they destroy your input string - not too much of a problem as long as you remember it happens.

Secondly, they skip blank fields.

If you can guarantee that your data doesn't have any blank fields, then this isn't a problem - if your data does or can have blank fields then you're out of luck.

Thirdly, they don't deal with more complex CSV data - they don't handle commas inside quoted strings, coalesce two double-quotes into a single quote within quoted strings, or remove leading/trailing spaces from unquoted strings. All of these can appear in an Excel-saved CSV file and, by extension, in any application that is modelled on it or saves similar data (OpenOffice Calc, for example).

[EDIT] In fact, I've just found a bug in my new SPLIT CSV STRING command, which just goes to show how many little things can trip you up.
