2012年10月13日 星期六

Stocktotal Database 之一


Schema:

drop table if exists BalanceSheet;
drop table if exists IncomeStmt;
drop table if exists CashFlowStmt;
drop table if exists StockCode;

create table if not exists BalanceSheet
(
    creation_dt datetime default current_timestamp,
    stock_code text not null,
    report_type character(1) not null,
    report_date datetime not null,
    activity_date datetime not null,
    item text not null,
    number text,
    revision int default 0,
    unique (stock_code, report_type, report_date, activity_date, item, revision) on conflict ignore
);

create table if not exists IncomeStmt
(
    creation_dt datetime default current_timestamp,
    stock_code text not null,
    report_type character(1) not null,
    report_date datetime not null,
    activity_date datetime not null,
    item text not null,
    number text,
    revision int default 0,
    unique (stock_code, report_type, report_date, activity_date, item, revision) on conflict ignore
);

create table if not exists CashFlowStmt
(
    creation_dt datetime default current_timestamp,
    stock_code text not null,
    report_type character(1) not null,
    report_date datetime not null,
    activity_date datetime not null,
    item text not null,
    number text,
    revision int default 0,
    unique (stock_code, report_type, report_date, activity_date, item, revision) on conflict ignore
);

create table if not exists StockCode
(
    creation_dt datetime default current_timestamp,
    code text unique,
    name text unique,
    isin_code text unique,
    listing_date datetime,
    market_category text,
    industry_category text,
    cfi_code text
);



還沒為查詢建 index。因為 SQLite 沒有 stored procedures 可以用,所以也沒有建立 stored procedures 之必要。((將來考慮用 PostgreSQL 之類的 open source database solution))



研究:

SQLite 會不會爆掉?想太多!

select count(*) from BalanceSheet -- 7518726
select count(*) from IncomeStmt -- 3561990
select count(*) from CashFlowStmt -- 276730

參考:http://www.sqlite.org/limits.html
Maximum Number Of Rows In A Table 
The theoretical maximum number of rows in a table is 2^64. This limit is unreachable since the maximum database size of 14 terabytes will be reached first. A 14 terabytes database can hold no more than approximately 1e+13 rows, and then only if there are no indices and if each row contains very little data.


有意思的查詢:近十年,哪些股票自由現金流量表 (free cash flow) 皆為正?

select stock_code, count(*) as positive_fcf_count from
(
    select stock_code, activity_date, sum(number) as sum from CashFlowStmt
    where report_type = 'C'
    and strftime('%m-%d', activity_date) = '12-31'
    and strftime('%Y', activity_date) > '2001'
    and item in ('Operating', 'Financing')
    group by activity_date, stock_code
)
where sum > 0
group by stock_code
having positive_fcf_count >= 10

特別留意 SQLite 對 date/time 的處理方式,請參考 http://www.sqlite.org/lang_datefunc.html。許多 MSSQL 常用的 date/time functions,SQLite 也有其對應 functions。個人認為更直覺。

答案:共 92 家。((未驗算,1301果然在榜單))

沒有留言:

張貼留言