Data Warehouse And Its Applications In Agriculture

 Step 7: Define the duration of database and
DATA WAREHOUSE AND ITS APPLICATIONS INperiodicity of updation.
AGRICULTUREStep 8: Track slowly the change in dimension.
 Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â  Dr. Satish R.Step 9: Decide the query priorities and query
Kolhe                modes.
AssistantAll the above steps are required before the data
 Â Â Â Â Â Â Â Â warehousing is implemented. The final step or
 Â Â Â Â Â Â Â Â Â step 10 is implemented a simple data warehouse
Gf's GCOEor data mart. The approach should be ‘from
 Â Â Â Â Â  NMUsimpler to complex'. First only a few data marts
Jalgaon                         are identified, designed and implemented. A data
 Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â         warehouse then will emerge gradually.
 Let us discuss the above mentioned steps in
A Data warehouse is a repository of integrateddetail. Interaction with the user is essential for
information, available for queries and analysis. obtaining answers to many questions. The user to
Data and information are extracted frombe interviewed includes top management, middle
heterogeneous sources as they are generated. management, executives, as also operational
This makes it much easier and more efficient tousers, in addition to sales force and marketing
run queries over data that originally came fromteams. A clear picture emerges from the entire
different sources. In other words Data warehouseproject on the data warehousing as to what are
is a database that is used to hold data fortheir problems and how they can be possibly
reporting and analysis. solved with the help of data warehousing.
 Economic foundation and productivity growth4.  Various Technology Considerations
depends on agricultural sectors. Agriculture is theThe following or technological issues [3] are
driving force behind the way of live and source ofrequired to be considered for designing and
earnings for the majority of peoples. More thanimplementing a data warehouse:
60 percents of population are living in rural areas1. The hardware platforms for Data Warehouse
and the majority are farmers. The rural2. DBMS for supporting data warehouse
communities as a main producer for country food3. Communication and network infrastructure for
productivity and food security earn only 11a Data Warehouse
percents of Gross Domestic Product (GDP). The4. The system management /operating system
arrival of information age guides this country toplatforms
new development strategies.5. The software tools for building, operating and
National Electronics and Computer Technologyusing Data Warehouse
Center (NECTEC) in collaboration with the Ministry  
of Agriculture, has launched "AgricultureHardware Platform
Information Network" as a response to theOrganization normally tend to utilize the already
unmet information requirements of the agriculturalexisting hardware platform for data warehouse
sector. Farmers should gain benefit from thedevelopment however the disk storage
contents provided which include risk assessment,requirements for a data warehouse will be
agriculture warning system and agriculturalsignificantly large, especially in comparison with
knowledge base, which aim to improvesingle application.
technology, productivity, income and stability ofIf data warehouse or data mart is small in data
India agriculture sector through the age ofsize, normal Pentium server will be probably
Information Technology. The data warehousesufficient with not very high reliability standards.
consists of common databases and geo-spatialHowever for a regular large data warehouse
databases from various departments andapplication the server has to be specialized for the
organizations in the country and abroad. Farmerstasks associated with a data warehouse. A
can get access to the contents through Internetmainframe, for example is well suited for this
by themselves or from groups of professionalpurpose, as a data warehouse server. What are
people called "Information Brokers".the features required for a successful data
 warehouse server? Firstly it should be able to
Keywords: Data Warehouse, Agriculture, ITsupport large data volume and complex query
 processing. In addition, it has to be highly scalable.
 As the user population keeps on growing, the
1.    Introductionnetwork traffic and the access traffic increase
A  Data  warehouse [1] is  a  repository  of significantly. Therefore, the requirement of data
integrated  information,  available  for warehouse server is the scalable high
queries  and analysis.  Data  and  information performance for data loading and ad hoc query
are  extracted  from  heterogeneous processing as well as the ability to support large
sources  as  they  are generated.  This database in a reliable and efficient manner. If the
makes  it  much  easier  and  more querying is going to be on a large public data
efficient  to  run  queries  over  data  thatnetwork then multiprocessor configuration will be
originally came from  different  sources.  Inrequired for parallel query processing. In case of a
other words Data warehouse is a database thatcomplex server of configuration with multiple
is used to hold data for reporting and analysis. processors and large I/O bandwidth a proper
  balance needs to be made between I/O and
Goals of Data Warehousingprocessing power.
- To facilitate reporting as well as analysis 
- Maintain an organizations historical informationDBMS Selection
- Be an adaptive and resilient source ofNext to hardware solutions a factor most critical
informationis the database selection. This determines the
- Be the foundation for decision makingspeed performance of the data warehousing
  environment. The requirement Of a DBMS for
Data Warehouse Architecturedata warehousing and requirement are scalability
Data warehouse Architecture comprises ofand high volume storage and processing and
- Operational source systemsthroughput in traffic.
- A data staging area            The majority of established
- One or more conformed data martsRDBMS vendors have implemented various
- A data warehouse databasedegree of parallelism in their products. Even
 though all the vendors have implemented various
Operational Source Systemsdegrees of parallelism in their products. Even
Operational  source  systems [1]  are though all the well known vendors-IBM, ORACLE
developed  to  capture  and  process SYBASE-support parallel database processing,
original  business transactions.  These some of them have improved their architectures
systems  are  designed  for  data  entry, so as to better suit the specialized requirement of
not  for  reporting,  but  it  is  from here thethe data warehouse. The RDBMS products
data in data warehouse gets populated.provide additional modules for OLAP cubes. The
 correct choice of OLAP server DB server and
Data Staging Areaweb server can be made by the designer or user
Data staging area  is where  the  rawof Data warehouse depending on the requirement.
operational  data is  extracted,  cleaned,  
transformed and combined  so  that  it  can Communication and Networking Infrastructure
be  reported  on  and  queried  by  users. Data warehouse can be internet enabled or
This area lies between the operational sourceintranet enabled as the choice may be. If web
systems and the user database and is typicallyenabled the networking is taken care by the
not accessible to users.internet. If only Intranet based then the
 appropriate LAN operational environment should
Data staging is a major process that includes thebe accessible to all the identified users. Thus
following sub procedures:network expansion may be required as per the
- Extractionneeds. In web enabled data warehouses, issues of
The extract  step  is  the  first  step  of security privacy and accessibility need to be
getting  data  into the  data  warehouse considered carefully .Accordingly web enablement
environment. Extracting means reading andfacilities should be ensured in the software tools
understanding the  source data,  and  copying used for data warehouse development.
the pas  that are needed to the data staging for  
further work.Stages in Implementation
- TransformationA data warehouse cannot be purchased and
Once  the  data  is  extracted  into  the installed. Its implementation requires the integration
data  staging  area,  there  are  many of implementation of many products. Following are
transformation steps, includingthe steps of the Data Warehouse implementation:
 Step 1: Collect and analyze business requirement.
1.  Cleaning the data by correcting misspellings,Step2: Create a data model and physical design
resolving domain conflicts, dealing withand data warehouse after deciding the
        missing data elements, and parsing               
into standard formats.            appropriate hardware platform.
2.  Purging selected fields from the legacy dataStep 3: Define the data sources
that are not useful for data warehouse.Step 4: Choose the DBMS and software platform
3.  Combining  data  sources  by  matching for data warehouse.
exactly  on  key  values  or  by Step 5: Extract the data from operational data
performing  fuzzy    matches on non-key sources, translate it, clean-up and load into the
attributes.     
4.  Creating  surrogate  keys  for  each             data warehouse model or data
dimension  record  in  order  to  avoid mart.
dependency  on legacy  defined  keys,Step 6: Choose database access and reporting
 where  the  surrogate  key  generation tools.
process  enforces  referential integrity betweenStep 7: Choose database connectivity software.
the dimension tables and fact tables.Step 8: Choose the data analysis (OLAP) and
5.  Building the aggregates for boosting thepresentation (client GUI) software.
performance of common queries.Step 9: Keep refreshing the data warehouse
- Loading and indexingperiodically.
At  the  end  of  transformation  process,   
the  data  is  in  the  form  of  load Access Tools
record  images. Loading  in  the  data With the exception of SAS(of SAS institute), all
warehouse  environment  usually  takes  the the Data Warehouses /OLAP vendors are not
form  of  replicating  the dimensional  tables currently providing comprehensive single-window
and  fact  tables  and  presenting  these software tools capable of handling all aspects of
tables  to  bulk  loading facilitates each data warehousing project implementation .SAS
recipient  data mart.  Bulk  loading  is a veryalone meets the requirement largely independently
important  capability  that  is to  be as it has its own database internally with a
contrasted  with  record-at-a  time  loading, capability of import data from any vendor DBMS
which  is  far  slower.  The target data martsoftware. Therefore one can implement a data
must then index the newly arrived data for querywarehousing and data mining solution
performance.independently with SAS.
 The best way to choose a group of tools is to
Data Martunderstand the capability and compatibility of
Data  mart  is  a  logical  subset  of  an different type of access to the data and
enterprise-wide  data  warehouse.  Forreporting by selecting best tool in market for that
example, a data warehouse for a retail chain iskind of access. The types of access and reporting
constructed incrementally from individual,are as follows:
conformed data marts dealing with separate- 1. Time series analysis
subject areas such as product sales. Dimensional - 2. Data visualization, graphing, charting and
data  marts  are  organized  by  subject pivoting
area  such  as  sales,  finance,  and - 3. Complex textual search (text mining)
marketing  and  coordinated  by  data - 4. General stastical analysis.
category  such  as  customer,  product, - 5. Artificial intelligence techniques for hypothesis
and  location. These  flexible  information testing, trends discovery, identification and
stores  allows  data  structures  to validation of data clusters and segments(also
respond  to  business  changes-product  line useful for data mining)
additions,  new  staff  responsibilities, - 6. Mapping of specifial information into geographic
mergers,  consolidations,  and acquisitions.information system
  - 7. Ad hoc user-specific queries
Data Warehouse Database- 8. Predefined repeatable queries
A data  warehouse database  contains  the - 9. Drilling down interactically
data  that  is  organized  and  stored - 10. Reporting the analysis by drilling down
specifically  for direct  user  queries  and - 11. Complex queries with multi-table forces,
reports.  It  differs  from  an  OLTP multilevel sub-queries, sophisticated search criteria.
database  in  the  sense  that  it  isdesignedIn some applications, the user requirement may
primarily for reads not writes. An  OLAP exceed the capability of tools. A number of query
application  is  a  system  designed  for tools are available in the market today which
few  but  complex  (read  only)  request. enables an ordinary user to build customized
An OLTP  application  is  a  system reports by easily composing and executing ad hoc
designed  for  many  but  simple queries without any necessity to have the
concurrent  (and  updating) requests.knowledge of the underlying design details or data
 base technology, SQL, or even the data model
Metadata  
Metadata defines the content and location of the5.  Its Applications in Agriculture
data in the data warehouse, relationships between  
the operational databases and the dataProject: Agriculture Information System Network
warehouse and the business views of the data in(AGRISNET)
the data in the warehouse as accessible to theDepartment of Agriculture and Cooperation (DAC)
end-user tools. Metadata is searched by user to[2] have taken steps to establish "Agricultural
find the subject areas and the definitions of theInformation System Network (AGRISNET)" in
data.collaboration with NIC. The Proposal recommends
For decision support, the pointers required to data(i) the state-of-the-art IT infrastructure
warehouse are provided by the metadata.requirements to establish AGRISNET as the
Therefore, it acts as logical link between theINTRANET over NICNET, (ii) development of
decision support system application and the datadatabases and information systems for decision
warehouse. Thus, any data warehouse designsupport for evaluation, monitoring and policy
should assure that there is a mechanism thatformulations, and (iii) human resources
populates and maintains the metadata repositorydevelopment, (iv) multi-media based training and
and that all access paths to data warehouse havedemonstration of transfer of technology to
metadata as an entry point. In other words therestrengthen Farm Research and Education using
should be no direct access permitted to thebroadcast VSATs, (v) special interest groups in
data-warehouse data if it does the user metadatarespect of subjects, problems, programmes,
definitions to gain the access. Meta data definitionschemes, etc, and above all, to make Indian
can be done by the user in any given dataAgriculture on-line for INTERNET and INTRANET
warehousing environment. The softwareaccess through AGRISNET Nodes. AGRISNET
environment as decided by the software toolsNodes are envisaged to be established at 
used will provide a facility for metadata definition  • DAC Hqrs (Krishi Bhawan), 
in a metadata repository.  • DAC Attached Offices and its regional
 offices, 
OLAP Vs OLTP  • DAC Subordinate Offices and its regional
 units, 
OLTP (Online Transactional Processing)  • DAC Public Sector Undertakings
- OLTP servers handle mission-critical production(NSC&SFCI) and sub-units, 
data accessed through simple queries  • DAC Autonomous Organizations, 
- Usually handles queries of an automated nature  • Apex Cooperative Organizations 
- OLTP applications consist of a large number of  • State Agriculture Departments 
relatively simple transactions.  • NCT/UT Agriculture Departments 
- Most often contains data organised on the basis  • District Agriculture Offices and 
of logical relations between normalised tables  • Block Agriculture offices 
• OLAP (Online Analytical Processing)In this direction, IFFCO has taken up a project in
- OLAP servers handle management-critical dataassociation with Indian Space Research
accessed through an iterative analyticalOrganisation (ISRO) to utilise satellite based
investigationremote sensing data and Geographical Information
- Usually handles queries of an ad-hoc natureSystems (GIS). Attention may be drawn to the
- supports more complex and demandingfact that the developed countries have been
transactionsutilising precision farming with the help of IT tools
- contains logically organised data in multiplefor a long time. While this will take a long time for
dimensionsour country due to small holdings, it is to be noted
 that GIS has an invaluable role to play even in the
2.    Warehouse Schema Designexisting conditions. Remote sensing and GIS
Dimensional modeling is a term used to refer ainformation can provide warnings on evolving crop
set of data modeling techniques that havegainedstresses, crop vigour, etc.
popularity  and acceptance for  data The IFFCO-ISRO GIS project extends support
warehouse  implementation.  Dimensionalfor efficient and timely availability of IFFCO's
modeling is one of the key techniques in datafertiliser to farmers though better logistics &
warehousing.  Two types of tables are used inefficient operations. It endeavours to provide
dimensional modeling: Fact tables and dimensionalfarmers' advisory services to provide decision
tablessupport to farmers on land related issues,
 weather, etc. In addition to the GIS based
 services, effort is being made to create
 databases that contain information of interest to
Fact Tables  the farmers. These include recommendation on
These are used to record actual facts andpackage of practices for major cereals, pulses,
measures in the business.  Facts are numerichorticulture, floriculture and animal husbandry,
data items that are of interest to the business. etc.Information on all the inputs such as seeds,
Example, telecommunication- length of call infertiliser, sources, current availability, prices,
minutes, average number of calls.availability of credit, alternatives available and
 terms and conditions, etc. are sought to be
Dimensional Tables provided. An important service envisaged is to
Dimensional tables establish the context of theprovide access to the nearest expert in case of
facts.  Dimensional tables store fields thatstress or any other problem witnessed in the
describe the facts.  Example, telecommunication-crops. Facilities are sought to be provided to
call origin, call destination.  A schema is a factencourage and share farm experiences by forging
table plus its related dimensional table.various crop forums. Many of the agricultural
  extension services are also proposed to be made
3. Crucial Decision in Designing a Data Warehouseonline using aspects of multimedia.
The job of designing and implementing a dataIn order to encourage farmers to obtain best
warehouse [3] is very challenging and difficult one,possible price, information on various agricultural
even though at the same time, there is a lot ofoutput markets (mantis) is also being provided.
focus and importance attached to it. The designerThe objective of this activity is to provide status
of the data warehouse may be asked by the topof price at different mandies to facilitate farmer
management:"take all enterprise data and build ato move his produce to the mandi where he can
data warehouse such that the management canexpect better price. Other areas of interest to
get answer to all their questions". This is dauntingfarmers such as distance education, location
task with responsibility being visible and exciting.specific news, etc. are also planned. Access to
But how to get started? Where to start? Whichother related sites of interest such those relating
data should be put first? Where is that datato courts, health, etc. are also sought to be
available? Which query should be answered? Howprovided.
would bring down the scope of project to  
something smaller and manageable, yet be6.  Conclusions
scalable to gradually upgrade to upgrade toAnalytical exploration of vast amount of
comprehensive data warehouse environmentagricultural data can best be support by
finally?appropriate application of Data Warehousing and
The recent trend is to build data marts for beforeOLAP technologies. A Data Warehouse provides
a real large data warehouse is built. People wantefficient and reliable structure of storage for vast
something smaller, so as to get manageableamount data while OLAP techniques provide
results before proceeding to the real datamechanisms for analysis of this data.
warehouse. 
RALPH KIMBALL identified a nine step method as7.  References
follows:[1] Data warehouse and its applications in
Step 1: Choose the subject matter.Agriculture, Anil Rai, Indian Agricultural Statistics
Step 2: Decide the what the fact table represents.Research Institute Library Avenue, New Delhi.
Step 3: Identify and confirm the dimension.[2] Information Technology in Agriculture, S.C.
Step 4: Choose the facts.Mittal.
Step 5: Store precalculation in the fact table.[3] Data Warehousing concepts, Techniques,
Step 6: Define the dimension and tables.Products and Applications, C.S.R.Prabhu.