In this paper we first briefly describe a data warehouse concept in general. We discuss the benefits of using it, trying to point out that data warehousing is important, not only in business enterprises, but in the university environment as well. Next we present a case study of using a data warehouse to enhance an existing information system, which handles admissions to university. Activities that are carried out within the admission process insert and use data from the operational database, which is then re-created every scholastic year. Instead of linking all databases together to perform a complex analysis, we could use a data warehouse, loaded with complete data set. Many advantages could be achieved with such an approach.
A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management's decision-making [1]. Simply, a data warehouse is repository of data that is specifically designed to make analysis of data simple and efficient. It is being developed to provide end-users with a collection of administrative and business information from the organisation's existing operational systems.
A data warehouse is different from an operational system because:
Today many organisations are trying to develop a data warehouse to meet their business objectives.
There are many benefits of data warehousing. The most important are listed below.
In the highly aggressive market, use of these techniques could be of great importance [2].
However, not only business enterprises can benefit from data warehousing. This concept of collecting data into separate, multidimensional repositories to handle complex decision making-activities, can be used in the university environment as well. Note that traditional administrative systems, such as Student information system (student's records, entrance examination, admission, examination records etc.) are designed to optimise transactional processing. However, data synthesis and analysis is rather difficult in such systems, causing low performance of operational databases.
By using a data warehouse we could overcome this kind of problems. Not only that we set operational data free from complex retrieval, there are also many other benefits, concerning the university management.
For example, consider these few questions, which could be answered easily, using a data warehouse:
Note that those questions are not very easy to answer, using relational databases, due to a simple fact: relational databases are good at retrieving small number of records quickly, but they do not retrieve a large number of records and summarise them on the fly [3].
In the next section we present a study case of using a data warehouse to enhance an existing information system, called VPIS [4]. VPIS is an admission-system that handles admissions to university study programmes. Basically it supports an algorithm which accepts or rejects applicants, according to their selected programmes, qualification rules, secondary-school grades, final exams etc. The fact that applicants are allowed to apply to three different study programmes makes this selecting algorithm very complicated.
There are mainly two reasons, which lead us into development of a data warehouse:
Candidates who intend to study at the university level in Slovenia apply to the admission service. In the application form they name up to three different study programmes, choosing from several faculties and departments. However, they have to be careful, because the priority in which they name study programmes could be very important. As we will see latter, this is due to different types of study programmes that influence the selection algorithm.
The admission process begins with a data capture from the application forms into a database. Once this is finished, the analysis is undertaken to find out how many times a specific study programme was chosen as the preferred option. Depending on this number, it could become either a limited or an unlimited type. The limited programmes are those with more applications then study places available. All others are unlimited programmes. In the next step additional activities are performed, including programme-specific testing, calculation of marks etc.
The core of the admission process is a selection algorithm [5] that tries to place the candidate on his preferred study programme, for which he fulfils all the qualifications and other required conditions. At that point the priority, and type of selected study programmes become important. Depending on this, candidates could be accepted directly, without any competition, or they have to compete with other candidates. Rules defining this decision process are stated below.
You can see the selection algorithm concerning unlimited/limited programmes in figures 1 and 2. Although the difference is obvious, we give you an example that explains the difference even more clearly.
Example:
Think of a candidate that would like to study most at the:
Suppose now that he has no preferred order between these three study programmes. Because he is unable to figure out which study programmes may become limited, he names them in random order. After the application analysis, it turns out that the physics study programme is the only one that is unlimited. Now observe what could happen.
In the selection algorithm the candidate is rejected, because of strong competition, both for the law study programme and for the computer science and informatics study programme. He is left with only the physics study programme. Although the programme is unlimited, he must still compete for the available places. It is important to note that in this case the available places are only those that were left by the applicants, which chose the programme of physics as the preferred programme. Unfortunately, he is rejected for that programme, too. In fact it turns out that he is not accepted anywhere.
A completely different scenario would happen if he selected the physics study programme as his preferred study programme. In that case, he would be accepted without any selection or competition.
The selection algorithm compares the totals of points that candidates get according to their success in the last two years of secondary school, final examinations and study programme specific testing. Calculations of totals are performed according to the formulas that are defined for every single study programme. Below you can see the example of a formula.
Example:
Department: Faculty of computer and information sciences.
Study programme: Computer science and informatics.
Qualification conditions:
a) Matura exam or
b) Internal final exam
In case of a limitation, the total of points must be calculated using this formula:
a) for candidates with the Matura exam: Total = 0,60(Sb) + 0,40(S34)
b) for candidates with the Internal final exam: Total = 0,20(SIf) + 0,40(S34) + 0,40(Sadd)
Legend: | S_b | number of points achieved by Matura exam |
S_34 | number of points achieved by success in last two years of secondary school | |
S_If | number of points achieved by Internal final exam | |
S_add | number of points achieved by mathematics and physics tests. |
Note that candidates do not have equal possibilities to be accepted by any faculty. Study programmes like medicine and law, always have very strong competition. Hence, only candidates with very high totals can compete for those study places. Others should avoid naming a study programme of medicine or law in the application form. For that reason, it is important to know how to evaluate the possibility of acceptance for the specific study programme. Our former experience shows that not many candidates have the knowledge how to do that. The fact is that candidates do not know, at the time when they are applying, neither how successful they are going to be at the final examinations or at specific testing, nor which study programmes will become limited. They can only guess.
However, there is something that we could help them with. These are results of past admissions. If the candidate, from the example above, knew that the physics study programme was almost always unlimited, then he would certainly choose it, as the most preferred study programme. He would also avoid choosing the law study programme, if he knew how many points he needed for that programme (supposing that he was not one of the best students, either at the secondary school, or at the finals).
As we stated earlier, candidates that apply for university study programmes would make fewer mistakes if they were provided with enough information. We think that publishing this information on the Internet would certainly make sense. Due to deep analysis that is necessary to get this kind of information, usage of a data warehouse would be highly recommended.
Here is an example of the historical information about the law study programme. We are sure applicants would find this information useful.
However, an even more exciting idea is to develop a tool that would be able to answer the questions like:
Of course, answers would be based on historical information. Although they were only estimations, we are sure they would be very useful.
We have shown that there is no particular reason why a data warehouse could not be used in a university environment. In fact, there are several features of data warehousing that could be beneficial, both for the university management and for applicants and students. However, the most important principle in data warehousing is that the project should be driven by a need to produce results for the end user and not by a desire to create a sophisticated IT system [6]. We believe that we have shown that too.
Marko Bajec, Rok Rupnik, Marjan Krisper
Trzaska 25, 1000 Ljubljana, Slovenia
e-mail:
marko.bajec@fri.uni-lj.si,
rok.rupnik@fri.uni-lj.si,
marjan.krisper@fri.uni-lj.si