
 

A Perspective on Reporting in J2EE


By Abhijit Belapurkar

 

I was recently involved in architecting a complex J2EE application with strict requirements about flexible and fast data reporting. While doing the groundwork for this project, I realised that reporting and its unique considerations are rarely given their due importance in forums and books on J2EE architecture and design. This article provides a high-level overview of what reporting entails in the J2EE context, and describes the factors that must be kept in mind when deciding how best to integrate reporting into your overall application architecture.

 

 

A modern-day enterprise has operational data in multiple disparate sources such as transactional databases, data warehouses and file systems. This data must be made available to those who need the information to run their daily business. Reporting is required in an enterprise to provide this information to relevant people in real-time; the actual reports themselves must be comprehensive and presented in a format that is familiar and intuitive to the target users.

Traditional reporting solutions were built around the ability to easily provide formatted output from queries against relational databases. In fact, many solutions in the market in those days were actually provided by the RDBMS vendors themselves. These solutions, though useful and popular in their own way, may not be the right fit for seamless integration with the complex distributed applications that manage enterprise data today.

 

With more and more distributed applications being built on the J2EE platform, there is a renewed need for an enterprise-ready, high-performing and scalable reporting solution that integrates seamlessly with the organisation’s portfolio of J2EE applications. In this article, I provide a high-level overview of the multiple facets associated with incorporating reporting functionality within J2EE applications.

 

The Basics of Reporting

Simply defined, reporting is the process of accessing, formatting, and delivering data as information to be viewed and analysed by users within and outside the organisation. Reporting may belong to one of two categories: static (i.e., operational) or dynamic (i.e., active).

 

Static Reporting, widely used across the enterprise, involves generating reports based on static report templates, mainly for delivering transactional information such as audit reports and user account activity reports.

 

Dynamic Reporting, typically used in the business intelligence marketplace, allows end users to customise reports as per their specific requirements, with respect to the representation of data. Such reporting aims at offering the end-user scope to analyse an existing report. This customisation could include selecting only required fields from a report, sorting/filtering data on the fly, changing the appearance of the charts/graphs and drilling down data to get detailed and aggregate information.

 

Solutions to this requirement range from the least appealing approach of hand-coding reporting functionality into your application (although this might not be entirely imprudent, as we will see in the subsequent sections) to using fancy ready-made reporting engines that come with a variety of features.

 

Before going further, let us briefly look at the steps involved during report generation. A given reporting solution may not follow all the steps as-is (or even in the same order) but these steps do occur in some form or other, and this should provide a very high-level view of what goes into making a report.

 

Broadly, a report is generated when a report design (or template) is interpreted through a report engine for a given set of runtime parameters, data sources and output view format. When a report is generated (or executed), the report engine first compiles (or interprets) the report design. Following this, it collates report data from a variety of sources (as indicated in the report design or by the report requestor, as the case may be), applies the runtime information sent with the request and finally formats the end-result in the specified view format.
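
The steps just described can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class MiniReportEngine, the RowSource interface and the toy design syntax ("title|col1,col2") are all hypothetical and do not correspond to any real reporting product. The sketch does no escaping of cell values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Hypothetical, minimal engine: compile the design, collate data, apply parameters, format. */
class MiniReportEngine {

    /** A "compiled" design: a title pattern plus the ordered column names to print. */
    static class CompiledDesign {
        final String titlePattern;
        final List<String> columns;
        CompiledDesign(String titlePattern, List<String> columns) {
            this.titlePattern = titlePattern;
            this.columns = columns;
        }
    }

    /** Supplies rows for the report; stands in for a database, file or service. */
    interface RowSource {
        List<Map<String, Object>> fetch(Map<String, Object> params);
    }

    /** Step 1: compile the design. Assumed toy syntax: "title|col1,col2,...". */
    static CompiledDesign compile(String design) {
        String[] parts = design.split("\\|", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("design must look like title|col1,col2");
        }
        return new CompiledDesign(parts[0], Arrays.asList(parts[1].split(",")));
    }

    /** Steps 2 to 4: collate data, apply runtime parameters, format in the requested view. */
    static String run(CompiledDesign design, RowSource source,
                      Map<String, Object> params, String format) {
        List<Map<String, Object>> rows = source.fetch(params);   // collate report data
        String title = design.titlePattern;                      // apply runtime parameters
        for (Map.Entry<String, Object> e : params.entrySet()) {
            title = title.replace("${" + e.getKey() + "}", String.valueOf(e.getValue()));
        }
        StringBuilder out = new StringBuilder();
        if ("html".equals(format)) {                             // HTML view format
            out.append("<h1>").append(title).append("</h1><table>");
            for (Map<String, Object> row : rows) {
                out.append("<tr>");
                for (String col : design.columns) {
                    out.append("<td>").append(row.get(col)).append("</td>");
                }
                out.append("</tr>");
            }
            return out.append("</table>").toString();
        }
        // Any other format value falls back to CSV: title line, header line, one line per row.
        out.append(title).append('\n');
        out.append(String.join(",", design.columns)).append('\n');
        for (Map<String, Object> row : rows) {
            List<String> cells = new ArrayList<>();
            for (String col : design.columns) {
                cells.add(String.valueOf(row.get(col)));
            }
            out.append(String.join(",", cells)).append('\n');
        }
        return out.toString();
    }
}
```

The same compiled design can be executed repeatedly with different parameters, row sources and view formats, which is the separation the paragraph above describes.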

 

Let us now look at the possible architectures that a reporting solution can be based on.

 

External Versus Embedded Reporting

An “External Reporting” solution comprises a dedicated reporting server that completely manages reporting requests and reports. Applications make their reporting requests by calling out to this server. Such a model requires additional planning (apart from planning for the application infrastructure) in terms of the deployment strategy and on-going maintenance of the reporting server.

 




Fig. 1: External reporting architecture

 

In the context of J2EE-based applications that are already deployed in an application server infrastructure (where long hours of planning are devoted to the deployment strategy of the server), building the reporting functionality into the enterprise application (via embedded reporting API tools) enables the application to leverage the existing infrastructure capabilities. Additionally, this model (see Fig. 2 for architecture) allows for reuse of existing application modules (e.g., reuse of the data layer of the application that collates information from several external systems and data sources, and possibly applies business logic on it to massage the information) whilst generating reports.

 

Many commercial reporting solutions have switched from a client-server (external) model to provide embedded reporting APIs that expose the reporting engine’s operations, with additional features to cater to web-based deployments. They offer high-level integration within a J2EE environment, in that they easily tie into the application’s infrastructure layers such as the application server, data layers and in some cases even the Integrated Development Environment (IDE).

 




Fig. 2: Architecture for embedded reporting

 

Reporting in an SOA World

The new, cool buzz-word “Service-oriented Architecture” or SOA introduces another twist in our enterprise reporting tale. SOA is an architectural style that advocates loose-coupling between providers and consumers of services (a service essentially being a unit of work that achieves the desired functional end-result). SOA provides a façade or a single point of contact that hides the business object model and underlying technical implementation details of the service provider from the consumer.

 

In this context, how do you deal with a reporting solution in a service-oriented architecture? One possibility is that “reporting” could be implemented and exposed to consumers like any other service (which, internally and unbeknownst to the consumer, may use a reporting engine to generate the actual reports). The exposed service would simply accept a request for a specific report, along with runtime parameters and the delivery method (browser/PDA, email, file-store etc.). In turn, the wrapped engine could contact the data sources, gather relevant data and generate reports from it. A more sophisticated design would use the business services interface layer to get “business objects” (instead of raw data from the respective sources) and generate the reports from those objects. This ensures that business logic does not have to be re-implemented in the reporting engine to massage the raw data into a form suitable for report generation.
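
As a rough sketch of such a service contract, consider the following. Every type name here (ReportService, ReportRequest, ReportResult, DeliveryMethod, WrappedEngine) is an assumption made for illustration. A real deployment would expose the service through whatever binding the organisation’s SOA stack uses.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Delivery targets named in the text; the enum itself is a hypothetical illustration. */
enum DeliveryMethod { BROWSER, EMAIL, FILE_STORE }

/** What a consumer sends: which report, its runtime parameters, and how to deliver it. */
final class ReportRequest {
    final String reportName;
    final Map<String, String> parameters;
    final DeliveryMethod delivery;

    ReportRequest(String reportName, Map<String, String> parameters, DeliveryMethod delivery) {
        this.reportName = reportName;
        this.parameters = Collections.unmodifiableMap(new HashMap<>(parameters));
        this.delivery = delivery;
    }
}

/** What the consumer gets back: rendered bytes plus the delivery method that was requested. */
final class ReportResult {
    final byte[] content;
    final DeliveryMethod delivery;

    ReportResult(byte[] content, DeliveryMethod delivery) {
        this.content = content;
        this.delivery = delivery;
    }
}

/** The facade that consumers see; nothing about the engine leaks through it. */
interface ReportService {
    ReportResult generate(ReportRequest request);
}

/** Stand-in for the reporting engine hidden behind the service. */
interface WrappedEngine {
    byte[] render(String reportName, Map<String, String> parameters);
}

/** Service implementation that simply delegates rendering to the wrapped engine. */
final class EngineBackedReportService implements ReportService {
    private final WrappedEngine engine;

    EngineBackedReportService(WrappedEngine engine) {
        this.engine = engine;
    }

    @Override
    public ReportResult generate(ReportRequest request) {
        byte[] content = engine.render(request.reportName, request.parameters);
        return new ReportResult(content, request.delivery);
    }
}
```

Because consumers depend only on ReportService, the engine behind it could be swapped, or changed to read business objects rather than raw data, without affecting them.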

 

Deciding on a Reporting Solution

As previously mentioned, there is a diverse set of options available – should you hand-code basic reporting as part of your application’s functionality or look at a ready-made option? In the latter case, should you go for a commercial or open source offering? How should you go about deciding which one best suits your needs?

 

The following is a basic list of things to consider and account for, depending on your specific context. Use the information thus gleaned to arrive at a decision.

 

How Complex are Your Reporting Requirements?

This section aims to help you decide whether the problem at hand warrants deploying a full-fledged (COTS or open source) reporting solution rather than manually developing the reporting functionality.

 

The primary consideration for whether or not to manually build reports into the enterprise application should be the amount of estimated effort required for developing, testing and maintaining this custom functionality. If end users demand a large number of reports (with complex look-and-feel) that keep changing with shifting information needs, attempting to custom develop the whole functionality is certain to drain resources away from other pressing tasks. Unless a lot of care is exercised, this may also lead to code that is spread out across components, poorly documented and very difficult to maintain.

 

This has to be balanced against the fact that commercial reporting solutions do not come cheap and may be impractical for enterprise applications developed on tight budgets. Hence, if there are only a handful of reports that are simple to begin with and unlikely to change in the future, hand-coding the reporting functionality may not be bad after all.

 

A few more sample scenarios and their alternative solutions are mentioned below:

 

When the reporting needs are entirely visual (meant primarily for online viewing) and involve only a few reports with simple data representation formats that rarely change, JSPs could easily address the situation.

 

Cases where reports need to be generated offline, or scheduled at a certain time of day because the report data itself is collated only at that time (e.g. user transaction audit reports are typically fired at day’s end), need to be addressed differently. If the focus is on offline report generation in a certain format, say PDF (which can later be circulated for review), one can address this by transforming the report data (as XML) with an XSL style sheet into XSL-FO, which is then passed through Apache FOP to produce PDFs. This approach, however, might not be appropriate for large multi-page reports, where FOP would build an in-memory tree of the XSL-FO (a fairly verbose XML format) and use a vast amount of memory to create the PDF reports; in such cases, reporting tools (embedded or external) might prove more effective.
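
The first half of that pipeline can be sketched with the JAXP transformation API that ships with the JDK, assuming the usual arrangement in which an XSL style sheet turns report data in XML into XSL-FO. The input vocabulary (an audit element containing entry elements), the class name AuditReportFo and the deliberately bare style sheet are assumptions for illustration. The resulting XSL-FO would then be handed to Apache FOP, a step not shown here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Sketch: transform hypothetical audit XML into XSL-FO using only JDK classes. */
class AuditReportFo {

    // Minimal style sheet: one page master, one flow, one fo:block per audit entry.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
        + "<xsl:template match='/audit'>"
        + "<fo:root>"
        + "<fo:layout-master-set>"
        + "<fo:simple-page-master master-name='page'><fo:region-body/></fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference='page'>"
        + "<fo:flow flow-name='xsl-region-body'>"
        + "<xsl:for-each select='entry'>"
        + "<fo:block><xsl:value-of select='@user'/>: <xsl:value-of select='@action'/></fo:block>"
        + "</xsl:for-each>"
        + "</fo:flow>"
        + "</fo:page-sequence>"
        + "</fo:root>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Returns the XSL-FO document produced for the given audit XML. */
    static String toXslFo(String auditXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(auditXml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so that callers of this sketch need not declare a checked exception.
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }
}
```

Note that this sketch buffers the whole XSL-FO document in a string, which only makes the memory concern raised above worse for large reports. Streaming the transformation output directly into the formatter would be the more careful design.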

 

Thus, deciding whether to opt for a reporting solution, and if so which one, requires narrowing down the specific business requirements and the corresponding technical requirements, and gauging how the available reporting solutions (or alternative homegrown ones) address them.

 

Should the Reporting Engine be External or Embedded?

As mentioned earlier, an “external” reporting engine runs as a separate server, probably on a separate machine. This implies that additional work must be done relating to capacity planning, deployment and on-going maintenance of this server. In addition, the reports server is required to perform data access without being able to reuse the data access logic built into the application (as well as the business logic that may be required to massage the raw data into ready-to-use information).

 

However, many reporting solutions are geared to cache commonly requested static reports. Since the memory for the cache is taken from the heap of the JVM in which the engine runs, hosting the engine on a separate server means that the cache makes no demands on the application’s own memory.
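
To make the memory trade-off concrete, here is a minimal sketch of a bounded cache for rendered static reports, built on the access-ordered mode of java.util.LinkedHashMap. The class name ReportCache and the idea of keying by a report identifier are assumptions; commercial engines implement their own caching. Every cached byte array lives on the heap of whichever JVM hosts this object.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical least-recently-used cache of rendered reports, bounded by entry count. */
class ReportCache {
    private final Map<String, byte[]> cache;

    ReportCache(final int maxEntries) {
        // accessOrder = true makes iteration order run from least to most recently used.
        this.cache = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > maxEntries;   // evict the least recently used report
            }
        };
    }

    /** Returns the cached report, rendering and storing it on a miss. */
    synchronized byte[] get(String reportKey, Function<String, byte[]> renderer) {
        byte[] rendered = cache.get(reportKey);
        if (rendered == null) {
            rendered = renderer.apply(reportKey);
            cache.put(reportKey, rendered);
        }
        return rendered;
    }

    /** Number of reports currently held. */
    synchronized int entryCount() {
        return cache.size();
    }
}
```

Bounding by entry count is the simplest policy. Bounding by total bytes would track heap usage more closely, at the cost of extra bookkeeping.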

 

Therefore, applications requiring intensive reporting, report version management and administration are likely to scale better with an external reporting server.

 

How Flexible is the Report Designer?

A report is represented as a composition of structural components such as data controls, charts and grids, as well as data access items such as simple or parameterised data source queries. Such a representation is referred to as the report design or template. The template vocabulary is typically specific to a reporting engine, thereby making it non-interoperable across reporting engines. This blueprint of the report is compiled (or interpreted) when the report is executed.

 

Most available reporting tools such as JReport, Formula One Reporting etc. allow adding and removing of visual components and dimensions using simple and intuitive drag-and-drop web designer interfaces. These also include wizards that guide you through common tasks such as connecting to the data source, linking data tables, selecting fields and records, grouping, sorting, summarising, and formatting.

 

In complex report layouts/representations, a report designer that allows you to preview the report with sample data has an edge over one that prohibits you from predicting the report appearance until you actually run the report via the engine.

 

Which Data Sources does the Engine Support?

Another concern while choosing a reporting solution is whether it supports your existing data sources. It needs to be adept at establishing relationships with multiple data sources of different types (JDBC, ODBC, XML, etc.) and should offer APIs to access user-defined data sources from applications.

 

Windward Reports, Formula One e-Reporting etc. are capable of generating reports based on the data structure inherent in hierarchical data sources like XML streams. This might be of interest if the data feed is expected from an external system.

 

Engines such as Formula One e-Reporting, which accept feeds from in-memory objects and result sets, are useful if your application has an existing data layer for collating data from varied data sources. If no such layer exists, this approach places the burden of data collation on the application and uses the engine merely as a report formatter. ReportMill also makes use of business objects that are already created and populated in the application, with the advantage that all the business logic that went into creating these objects from raw result sets read from the database is implicitly reused.

 

JasperReports is an open source reporting solution that abstracts the underlying reporting data sources behind a data source interface (JRDataSource in its API). One can implement this interface to wrap tabular data that comes from a variety of different sources, including EJBs.
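
The general shape of such an abstraction can be sketched without any library. The TabularSource interface below is a stand-in defined purely for illustration and is not the JasperReports type; AccountBean and AccountSource are likewise hypothetical. The sketch shows a cursor-style contract through which an engine could pull field values from business objects the application has already populated.

```java
import java.util.Iterator;
import java.util.List;

/** Stand-in cursor contract (hypothetical): advance to a row, then read fields by name. */
interface TabularSource {
    boolean next();
    Object getFieldValue(String fieldName);
}

/** A business object the application already creates and populates. */
class AccountBean {
    final String owner;
    final double balance;

    AccountBean(String owner, double balance) {
        this.owner = owner;
        this.balance = balance;
    }
}

/** Adapts a list of business objects to the cursor contract above. */
class AccountSource implements TabularSource {
    private final Iterator<AccountBean> iterator;
    private AccountBean current;

    AccountSource(List<AccountBean> accounts) {
        this.iterator = accounts.iterator();
    }

    @Override
    public boolean next() {
        if (!iterator.hasNext()) {
            current = null;
            return false;
        }
        current = iterator.next();
        return true;
    }

    @Override
    public Object getFieldValue(String fieldName) {
        if (current == null) {
            throw new IllegalStateException("no current row; call next() first");
        }
        switch (fieldName) {
            case "owner":   return current.owner;
            case "balance": return current.balance;
            default: throw new IllegalArgumentException("unknown field: " + fieldName);
        }
    }
}
```

An engine written against the interface never learns whether the rows came from a database, an EJB call or an in-memory list, which is the point of the abstraction.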

 

Needless to say, if you plan to collate data from more than one data source for a report design, you must choose an engine that supports multiple data sources within one report design template.

 

What are the Supported Report Formats and Delivery Modes?

For visual reporting (reports for display), one should look for reporting tools that provide some form of web markup language, HTML or XML, and almost any engine worth its salt provides this. With most J2EE reporting tools, reports can be easily packaged or published as web components (leveraging the existing application server or portal server environment) for viewing via a web browser or can be custom-viewed via report viewers (typically in case of actionable reports).

 

Nearly all tools support report output generation in PDF and other formats. However, if report archiving and versioning is a concern, we might want to go with an engine that supports these features as well (unless there is an existing document management system in place which we might want to integrate with).

 

A commonly desired delivery mode these days is email, or direct delivery to PDAs or cell-phones. Many commercial engines provide full support for this. In general, the more flexibility the reporting engine provides in terms of formatting options supported, the easier it is to explore new avenues for preparing and displaying custom reports on non-traditional computing devices.

 

Does the Engine Meet Your Performance Considerations?

Any enterprise solution should address your current and future business needs by meeting the required response times and scaling with the expected workload. The performance considerations for reporting within a J2EE environment are not very different from those for any other J2EE application.

 

Typical causes of performance problems for reporting applications include improper application design, lack of database tuning, network bottlenecks, undersized infrastructure (hardware/web and application server) and improper infrastructure configuration.

 

If you have an RDBMS-based reporting data source, erroneous configurations such as insufficient indexing and fragmented databases could cause serious database bottlenecks. If your application design is such that you feed data to the engine (via result sets or other in-memory objects), design errors such as missing or poor data cache management mechanisms, flawed database connection pooling and non-optimised database queries could impact data retrieval. Alternatively, if the engine is doing the data gathering job, it is imperative to find out if the above-mentioned aspects have been taken care of.

 

On a related note, in case your reporting requirement is data intensive and does not involve online viewing, you might want to consider scheduled reporting. This would avoid impacting the response times to users working on other applications (or even the same application) that share the same application data, owing to any locks that the report engine (by means of your database queries) might hold during data retrieval.
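
A scheduled run of this kind can be sketched with the JDK’s ScheduledExecutorService. The class name NightlyReportScheduler and its methods are assumptions. The fixed 24-hour period ignores daylight-saving shifts, and inside an application server a container-managed timer or scheduler would normally be preferred over application-created threads.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that runs a report job once a day at a fixed local time. */
class NightlyReportScheduler {

    /** Time from 'now' until the next occurrence of 'runAt' (today, or tomorrow if already past). */
    static Duration delayUntil(LocalDateTime now, LocalTime runAt) {
        LocalDateTime next = now.toLocalDate().atTime(runAt);
        if (!next.isAfter(now)) {
            next = next.plusDays(1);
        }
        return Duration.between(now, next);
    }

    /** Schedules the job at the next 'runAt' and then every 24 hours after that. */
    static ScheduledFuture<?> scheduleDaily(ScheduledExecutorService executor,
                                            Runnable reportJob, LocalTime runAt) {
        long initialDelayMillis = delayUntil(LocalDateTime.now(), runAt).toMillis();
        return executor.scheduleAtFixedRate(
                reportJob, initialDelayMillis, TimeUnit.DAYS.toMillis(1), TimeUnit.MILLISECONDS);
    }
}
```

For example, scheduleDaily(Executors.newSingleThreadScheduledExecutor(), job, LocalTime.of(23, 30)) would run a data-intensive report at 23:30 each day, away from peak interactive use.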

 

Benchmark brochures are available for commercial reporting engines and can be used as a starting point to decide whether they can match your performance requirements. At the same time, these brochures should not blindly dictate your choice, because the numbers could be off the mark if your integration design (based on your requirements) varies from what the vendors have proposed. The safer bet would be to study the response times by evaluating different reporting products against your design prototype. Finally, when you have integrated a reporting solution with your application, plan for automated performance tests for your system and run them iteratively to identify and remove bottlenecks.

 

What Kind of Security Options does the Engine Provide?

It is crucial for the right information to be made available to the right people and at the right time. Therefore, any reporting solution under consideration must be capable of securely delivering authorised content to authenticated users. The reporting solution may assume that the functionality to authenticate the users is implemented outside its scope and applies access control to the data content for the authenticated user.

 

There are multiple ways in which different end-users can be presented with relevant and authorised information, as described below.

 

The simplest mechanism is essentially a parameterised execution of the report queries against the database. That is, the same report design is reused for all users. The design, however, includes an abstract query with parameters that get instantiated at runtime with user-specific values that control the scope of information that is finally returned by the execution of that query against the database. This option is expensive since the query must be executed once for every user (or for every group of users that share the same parameter values) and the report constructed and formatted from the data resulting from that query.
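
In plain JDBC terms, this mechanism amounts to a shared query with placeholders that are bound per user at runtime. The sketch below assumes a hypothetical sales table and a hypothetical rule that a user’s region scopes what they may see. The class and method names are likewise illustrative.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: one abstract query in the design, bound with user-specific values per execution. */
class ScopedReportQuery {

    // The abstract query shared by all users; table and column names are hypothetical.
    static final String SALES_BY_REGION =
            "SELECT order_id, amount FROM sales WHERE region = ? AND order_date >= ?";

    /** User-specific values for the placeholders, in placeholder order. */
    static List<Object> bindingsFor(String userRegion, java.sql.Date since) {
        return Arrays.<Object>asList(userRegion, since);
    }

    /** Executes the query once for one user (or one group sharing the same bindings). */
    static List<Object[]> execute(Connection connection, String sql, List<Object> bindings)
            throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            for (int i = 0; i < bindings.size(); i++) {
                statement.setObject(i + 1, bindings.get(i));   // JDBC parameters are 1-based
            }
            try (ResultSet resultSet = statement.executeQuery()) {
                int columnCount = resultSet.getMetaData().getColumnCount();
                List<Object[]> rows = new ArrayList<>();
                while (resultSet.next()) {
                    Object[] row = new Object[columnCount];
                    for (int c = 0; c < columnCount; c++) {
                        row[c] = resultSet.getObject(c + 1);
                    }
                    rows.add(row);
                }
                return rows;
            }
        }
    }
}
```

The cost noted above is visible here: execute must be called once per distinct set of bindings, so the number of database round trips grows with the number of user scopes.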

 

Another technique is “report bursting”, in which a single query is executed to get all pertinent data from the repository. Following this, a dedicated component of the reporting engine can read the contents of this enterprise report looking for user-defined “descriptors” (which may include lines, locations, titles, keywords etc.) to identify the appropriate pages of a report to separate (or “burst out”). All pages of the main report that match the defined descriptors are separated and spooled as a sub-report that can be delivered separately when and where required. ACLs can also be assigned to pages and later used as the criteria to decide whether a particular page should be displayed to a specific user or not (depending on whether or not the user ID, or any other defined user attribute, is contained in the ACL associated with that page).
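
A toy version of bursting can be sketched as follows, assuming that pages are plain text, that descriptors are simple keywords, and that each page carries its own ACL of user IDs. ReportBurster and Page are hypothetical names, and real engines use richer descriptors than substring matches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch: split one master report into sub-reports by descriptor, then filter pages by ACL. */
class ReportBurster {

    /** One page of the master report with the user IDs allowed to see it. */
    static class Page {
        final String text;
        final Set<String> acl;

        Page(String text, String... allowedUsers) {
            this.text = text;
            this.acl = new HashSet<>(Arrays.asList(allowedUsers));
        }
    }

    /** Collects, per descriptor, every page whose text contains that descriptor. */
    static Map<String, List<Page>> burst(List<Page> masterReport, List<String> descriptors) {
        Map<String, List<Page>> subReports = new LinkedHashMap<>();
        for (Page page : masterReport) {
            for (String descriptor : descriptors) {
                if (page.text.contains(descriptor)) {
                    subReports.computeIfAbsent(descriptor, k -> new ArrayList<>()).add(page);
                }
            }
        }
        return subReports;
    }

    /** Keeps only the pages whose ACL contains the given user ID. */
    static List<Page> visibleTo(String userId, List<Page> pages) {
        List<Page> visible = new ArrayList<>();
        for (Page page : pages) {
            if (page.acl.contains(userId)) {
                visible.add(page);
            }
        }
        return visible;
    }
}
```

The single expensive query runs once to build the master report. Bursting and ACL filtering are then cheap in-memory passes over its pages.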

 

A third alternative is to depend on the metadata security functionality provided by the repository itself. This is essentially akin to a user firing an on-demand query to the database with the authenticated user’s security credentials being sent with the request so that the database engine can appropriately restrict the data sent back in response. Such implementations are, however, prone to suffer from unpredictable server workload leading to performance losses.

 

Good reporting engines should integrate seamlessly with the security infrastructure already existing in the enterprise and leverage the security context set up for the user who is using the enterprise application(s) into which the reporting functionality is embedded.

 

Conclusion

Information plays a crucial role in successfully running a modern enterprise. Hence, it is not surprising that decision makers within any organisation expect to have access to a reliable vehicle for information delivery that allows them to slice and dice the data in their business systems, view it in multiple (possibly dynamic) formats and mine that data for useful business information.

 

In this article, I have provided a high-level overview of the concerns involved in enterprise reporting, and the choices available to address those concerns. I have also summarised guidelines for making the right choice and planning the optimum integration design, so as to avoid significant impact on performance and maintainability while meeting the organisation’s information needs.

 

 

Abhijit Belapurkar has a bachelor of technology degree in computer science from the Indian Institute of Technology (IIT), Delhi, India. He has worked in the areas of architectures and information security for distributed applications for almost 10 years and has used the Java platform to build n-tier applications for over five years. He presently works as a senior technical architect in the J2EE space, with Infosys Technologies Limited, Bangalore, India.

 

Links & Literature

Some popular commercial or open source reporting solutions in the market include:

 

·    Formula One e.Report from ReportingEngines, a division of Actuate: http://www.reportingengines.com/index.jsp

·    JReport from Jinfonet Software: http://www.jinfonet.com/

·    ReportMill’s Object Reporting: http://www.reportmill.com/

·    Crystal Reports from Business Objects: http://www.businessobjects.com/

·    Style Report from InetSoft Technology: http://www.inetsoft.com/inetsoft/index.html

·    JasperReports: http://jasperreports.sourceforge.net/

·    Windward Reports: http://www.windwardreports.com/

 
Copyright © 2008 SDA Asia Magazine - All Rights Reserved