Support Board


Date/Time: Sun, 24 Nov 2024 02:54:00 +0000



Post From: Offering To The Community: Enhanced Intraday Data File Compression with Large Speedups

[2016-02-04 19:14:19]
bjohnson777 (Brett Johnson) - Posts: 284
*** This probably won't work anymore with the date change in SC 2150.

In 1bar mode, this program essentially does the same job as the built-in Intraday Data File Compression, but 10-20x faster.

Where it gets interesting is 4bar mode, where each bar is split into 4 ticks for Open, High, Low, and Close. For some reason, a chartbook using this mode loads about 6x faster on my system than one using the usual 1bar compression, even though the compressed file is 4x larger.
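To make the 4bar idea concrete, here is a minimal sketch of splitting one consolidated bar into four tick-like records. This is not the code from SCIDRecordWrite4Bars(); the TickRec and Bar structs and the SplitBarInto4 name are hypothetical, the record is simplified and is not the real SCID layout, and the even split of volume across the four records is purely an assumption.

```cpp
#include <array>
#include <cstdint>

// Hypothetical, simplified output record (NOT the real SCID layout).
struct TickRec {
    double   time;    // start of the time block
    float    price;   // one of Open, High, Low, Close
    uint32_t bidVol;  // down volume carried by this record
    uint32_t askVol;  // up volume carried by this record
};

// Hypothetical consolidated bar for one time block.
struct Bar {
    double   time;
    float    open, high, low, close;
    uint32_t bidVol, askVol;
};

// Emit 4 records in Open, High, Low, Close order.
// Assumption: volume is divided evenly and the remainder goes on the
// last (Close) record, so the bar totals are preserved exactly.
inline std::array<TickRec, 4> SplitBarInto4(const Bar& b) {
    const float prices[4] = {b.open, b.high, b.low, b.close};
    std::array<TickRec, 4> out{};
    for (int i = 0; i < 4; ++i) {
        out[i].time   = b.time;
        out[i].price  = prices[i];
        out[i].bidVol = b.bidVol / 4;
        out[i].askVol = b.askVol / 4;
    }
    out[3].bidVol += b.bidVol % 4;
    out[3].askVol += b.askVol % 4;
    return out;
}
```

Whatever split is used, the sum of the four records' bid and ask volumes should equal the bar's totals so the up/down ratio survives.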

Note that this is an external command line program (EXE) and not an SC plugin study (DLL), so it's a bit more complicated to run. Details below. The source code compiles cleanly on my 32bit and 64bit systems (linux and win). If you don't want to compile it yourself, I've attached 2 EXEs for win platforms. They are self-contained and should run cleanly. If you're using an ancient computer that is 32bit only, choose the 32bit file. If you're running a multi-core system bought within the past several years, choose the 64bit file. If there's a problem, the 32bit file should run on both.

This program will keep the up/down volume ratios from tick by tick data, similar to the SC built-in function.
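As a rough illustration of what keeping the up/down volume means when ticks are folded into a bar, here is a minimal sketch. The Tick and Bar structs and the Consolidate name are hypothetical and simplified, and treating total volume as bid plus ask is an assumption made only for this example; the attached source is the reference for what the program really does.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified tick and bar types for illustration only.
struct Tick {
    float    price;
    uint32_t bidVol;  // volume traded at the bid (down volume)
    uint32_t askVol;  // volume traded at the ask (up volume)
};

struct Bar {
    float    open, high, low, close;
    uint32_t numTrades, totalVol, bidVol, askVol;
};

// Fold all ticks of one time block into a single bar.
// Bid and ask volume are summed separately, so the up/down ratio
// of the tick data is carried over to the bar.
inline Bar Consolidate(const std::vector<Tick>& ticks) {
    Bar b{};  // all fields zero
    if (ticks.empty()) return b;
    b.open = b.high = b.low = ticks.front().price;
    for (const Tick& t : ticks) {
        if (t.price > b.high) b.high = t.price;
        if (t.price < b.low)  b.low  = t.price;
        b.close      = t.price;
        b.numTrades += 1;
        b.bidVol    += t.bidVol;
        b.askVol    += t.askVol;
        b.totalVol  += t.bidVol + t.askVol;  // assumption: total = bid + ask
    }
    return b;
}
```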

Support didn't think this was possible, but it works just fine with Ask/Bid Volume (SC_ASKVOL and SC_BIDVOL). In a few days I'll be updating the other studies I've posted to make this the default.

I also programmed a 2bar version where up counts are one bar and down counts are the other. For some reason this causes SC to hang and barf. This is probably an SC bug that needs to be looked at. DO NOT use the 2bar output for now.

-----

Support:

Have a look at SCIDRecordWrite4Bars() just above main(). This is what I'm using to create the 4 OHLC bars. I've already provided the code, so this speedup just needs to be integrated into the SC compress function. It should be easy.

The SCID file format page also needs some updating:
http://www.sierrachart.com/index.php?page=doc/doc_IntradayDataFileFormat.html

The data types should be size specific and not generic anymore. A long int may be 32bit or 64bit depending on the compiler. An int32_t or uint32_t will be the same on all compilers. Have a look at the top of my source file for what I'm using.
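A minimal sketch of what size-specific types buy you is below. It assumes a record made of one 8-byte double timestamp, four 4-byte floats, and four 32-bit counters (40 bytes in total); the struct name and field names are illustrative only, so check them against the doc page and the top of the attached source file.

```cpp
#include <cstdint>

// Illustrative record using fixed-width types.
// Assumed layout: 8-byte timestamp + 4 floats + 4 uint32_t = 40 bytes.
#pragma pack(push, 1)
struct ScidRecordSketch {
    double   DateTime;
    float    Open;
    float    High;
    float    Low;
    float    Close;
    uint32_t NumTrades;
    uint32_t TotalVolume;
    uint32_t BidVolume;
    uint32_t AskVolume;
};
#pragma pack(pop)

// These checks fail at compile time if the platform disagrees,
// instead of silently reading or writing a corrupt file.
static_assert(sizeof(float) == 4 && sizeof(double) == 8,
              "unexpected floating point sizes");
static_assert(sizeof(ScidRecordSketch) == 40,
              "record size does not match the assumed on-disk size");
```

With a plain long int in place of uint32_t, the same struct could come out larger on a compiler where long is 64bit, and the second static_assert would catch it.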

Also s_Record doesn't exist anymore on the doc page.

-----
Pieces from my file notes:


This program is also an example of a working SCID reader. If
Version 2 of the file format comes out, this program will likely need
updating. Using this program on a file of another version will probably
corrupt that file.

Building and running: This program uses specific sized data types and should
compile cleanly on 32bit and 64bit systems.

Linux:
g++ -O3 --static -o SC_CompressDataUnitSize SC_CompressDataUnitSize.cpp
Copy SC_CompressDataUnitSize to the SC Data directory.
Example: ./SC_CompressDataUnitSize -c4 -tm -u1 GBPUSD.scid.old GBPUSD.scid

Windows:
Change directory to C:\SierraChart\CPPCompiler\bin
Copy the SC_CompressDataUnitSize.cpp source file here.
Open a DOS window.
g++.exe -O3 --static -o SC_CompressDataUnitSize.exe SC_CompressDataUnitSize.cpp
Copy SC_CompressDataUnitSize.exe to the SC Data directory.
If you haven't already, fully exit SC to avoid causing data corruption.
While in the Data directory, rename whatever files you want to compress with
a ".old" extension. In this example GBPUSD.scid gets renamed to GBPUSD.scid.old.
If there is a problem, delete the bad SCID file (GBPUSD.scid in this example)
and rename GBPUSD.scid.old back to GBPUSD.scid.
Open a DOS window and run from the Data directory:
SC_CompressDataUnitSize.exe -c4 -tm -u1 GBPUSD.scid.old GBPUSD.scid

After this program finishes, use SC to export the data to CSV if necessary.
This program includes CSV exports (mainly for debugging), but SC will probably
export cleaner and more usable files.

This program aligns the output bars to the beginning of each time block.
This keeps bars aligned in non-market graphing programs and spreadsheets.
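Aligning to the beginning of a time block amounts to rounding each timestamp down to a multiple of the bar length. A minimal sketch follows; it assumes timestamps as whole seconds in a 64-bit integer, which is a simplification for illustration and not necessarily how the program or the SCID file stores time, and AlignToBlockStart is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Round a timestamp (in seconds) down to the start of its time block.
// blockSeconds is the bar length, e.g. 60 for 1min bars.
inline int64_t AlignToBlockStart(int64_t tSeconds, int64_t blockSeconds) {
    assert(blockSeconds > 0);
    int64_t rem = tSeconds % blockSeconds;
    if (rem < 0) rem += blockSeconds;  // keep rounding downward for negative times
    return tSeconds - rem;
}
```

For example, with 1min bars a tick at second 125 lands in the bar that starts at second 120, and every tick in that minute gets the same bar time.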

My ChartBook Load Speed Test:
Original 1 Tick SCID Size: 6g (around 4-5min to load)
smashed by SC = 80sec (49megs).
1bar = 80sec (48megs). This is essentially similar to smashed by SC.
2bar = ?sec (95megs). Hangs
4bar = 14sec (191megs).

Usage Screen:

Usage: SC_CompressDataUnitSize.exe -opts InFile.scid OutFile.scid
Program to compress Sierra Chart Version 1 SCID tick by tick files down
in size with different methods while preserving up and down volume counts.
This version is 10-20x faster than the built in SC function. The "-c4"
option also offers a 6x run time speed up than the traditional compression.

Options (-opts) start with the dash (-) character followed by a letter and
a number (replaces the #) with some options.
-c#: Bar Consolidation Type. 1 is similar to SC's built in function. 2 hangs SC
for some reason. It will produce an up and down bar for each time unit. 4 will
give the 6x speed up. It produces 4 bars for Open, High, Low, and Close ticks.
-t#: Time Prefix. Options are s for seconds, m for minutes, h for hours, and d
for days.
-u#: Time Units. The number of units for -t. "-tm -u1" would give 1min bars.
Note SCID intraday files can have a maximum bar length of 1 day. Anything over
that will be rounded down to 1 day. Use the daily CSV text format for anything
higher than a day. Watch out for uneven dividing of the time units into 1
trading day. This program does not convert time zones. Be careful with larger
time units.
-x#: Cut days older than # back. This is used for trimming down SCID files.
-y#: Do not process (just pass them through) ticks from the last # days.
Days back are CALENDAR days, not trading days. Watch out for weekends and
holidays. Usually give 3-4 extra days to account for that.
-r: Write out CSV file from the SCID input. Watch out for file size.
-R: Same as -r except more human readable for debugging.
-w: Write out CSV file from the SCID output. Watch out for file size.
-W: Same as -w except more human readable for debugging.
-d: Enable debugging mode. More output is given.
-B: Batch mode. Doesn't display the warning. Use with caution.

Time Options Examples: 1sec bars: -ts -u1. 30sec bars: -ts -u30.
1min bars: -ts -u60 or -tm -u1. 10min bars: -tm -u10.
45min bars: -tm -u45. 1hr bars: -tm -u60 or -th -u1.
4hr bars: -th -u4. 6hr bars: -th -u6. 1day bars: -td -u1.
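The -t and -u pairs above boil down to a bar length in seconds, capped at 1 day. Here is a minimal sketch of that arithmetic. The function names are hypothetical, returning 0 for an unknown prefix is an assumption of this example, and the evenness check uses a 24 hour calendar day, which may not match a particular market's trading day.

```cpp
#include <cstdint>

// Convert a time prefix (s, m, h, d) and a unit count into seconds.
// Anything longer than 1 day is rounded down to 1 day.
// Returns 0 for an unknown prefix (assumption for this sketch).
inline uint32_t BarLengthSeconds(char prefix, uint32_t units) {
    uint64_t mult = 0;
    switch (prefix) {
        case 's': mult = 1;     break;
        case 'm': mult = 60;    break;
        case 'h': mult = 3600;  break;
        case 'd': mult = 86400; break;
        default:  return 0;
    }
    const uint64_t kDay = 86400;
    const uint64_t len  = mult * units;  // 64-bit to avoid overflow
    return static_cast<uint32_t>(len > kDay ? kDay : len);
}

// True if the bar length divides a 24 hour day with nothing left over.
inline bool DividesDayEvenly(uint32_t barSeconds) {
    return barSeconds != 0 && 86400u % barSeconds == 0;
}
```

This shows why "-ts -u60" and "-tm -u1" are equivalent, and why something like 7hr bars leaves an odd-sized block at the end of each day.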

Convert forex EUR/USD tick by tick data to fast 1min bars discarding anything
older than 30 days and not converting the past 7 days with debug CSV files:
SC_CompressDataUnitSize.exe -c4 -tm -u1 -x30 -y7 -R -W EURUSD.scid.old EURUSD.scid
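One way to picture how -x and -y interact is as a three-way decision per tick based on its age in calendar days. The sketch below is only an illustration of that idea: the names are hypothetical, timestamps are assumed to be whole seconds, 0 is assumed to mean "option not given", and the exact boundary handling is a guess, not taken from the program.

```cpp
#include <cstdint>

enum class TickAction { Drop, Compress, PassThrough };

// Decide what happens to a tick given -x (cutDays) and -y (keepDays).
// Days are calendar days of 86400 seconds, counted back from nowSeconds.
inline TickAction ClassifyTick(int64_t tickSeconds, int64_t nowSeconds,
                               uint32_t cutDays, uint32_t keepDays) {
    const int64_t kDay = 86400;
    const int64_t age  = nowSeconds - tickSeconds;
    if (cutDays > 0 && age > static_cast<int64_t>(cutDays) * kDay) {
        return TickAction::Drop;         // older than -x: trimmed from the file
    }
    if (keepDays > 0 && age <= static_cast<int64_t>(keepDays) * kDay) {
        return TickAction::PassThrough;  // within -y: copied unchanged
    }
    return TickAction::Compress;         // everything in between
}
```

With -x30 -y7 as in the example above, a 40 day old tick is dropped, a 10 day old tick is compressed into bars, and a 3 day old tick is passed through as is.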

Version 0.9 2016-02-03 GPL'd and Open Sourced by Brett Johnson

-----
A list of my programs is available on the "Brett Johnson's Standard Tool Kit" DLL page.
Offering To The Community: Brett Johnson's Standard Tool Kit
Date Time Of Last Edit: 2020-09-16 09:38:42
Attachment: SC_CompressDataUnitSize_32bit.exe - Attached On 2016-02-04 18:29:05 UTC - Size: 121.5 KB - 702 views
Attachment: SC_CompressDataUnitSize_64bit.exe - Attached On 2016-02-04 18:29:10 UTC - Size: 189 KB - 573 views
Attachment: SC_CompressDataUnitSize.cpp - Attached On 2016-02-04 19:01:35 UTC - Size: 45.4 KB - 786 views