Bascially we are in need of choosing a suitable database/ETL tool for our Data warehouse requirement.
I would like to give you a brief on what is the requirement about.
Input to the system => SWIFT MESSAGES + INTERNALMESSAGES + EXTERNAL MESSAGES
The above inputs land in MQ Series
SOURCE(QUEUES)----->Target DB(May be Teradata or any other database )----->Reporting
These messages are completely online and their frequency is in terms of secs/milliseconds.
Currently they are running online reports by using the data in in-memory DB. They calculate
business info on the fly and show it in screen.
They need a kind of NEAR-REAL TIMe Datawarehouse to store the transformed data online with the source information as Messaging Queue.
They want to compare this real time transformed data with previous months transformed data i.e
they want to compare the historical information with the current transformed online data.
Curent source : MEssage Queues (where message is stored)
Volume : 3 million recs/day
Size in Bytes : 18 - 20 GB per day
Expected Growth in Vol : 18 million in next 3 years
Expected volume in bytes: 70 GB per day
We feel that the current OLTP database ORACLE 10g won't be suitable for building a new Dataware house with this volume. We need facts/details as to what parameters that we need to prove that ORACLE won't work out..
Latency provided to land to the target DB from source MEssage Queue is just 3 minutes.
I am feeling that ETL tool which is traditionally a Batch processing tools might not fit here..
I am suggesting JAVA Spring/Hibernate here to pull the messages and put it to the TARGET so that we get the transformed info immediately..
NOTE: They don;t have DWH currently to do all these operations