Using Informatics to Manage and Measure Performance of Large Femtocell Networks

This paper describes the analytics platform and algorithms used to manage the performance of large femtocell networks with close to one million femtocells from multiple vendors. The system incorporates parallel processing of the data, rule-based artificial intelligence, and large-scale automated data analysis and data mining.


Introduction
Femtocell technology has advanced from the concept and research phase [1]-[5] to large operational networks with millions of units deployed worldwide. Residential and enterprise femtocells, also called small cells, are viewed by operators as a solution for providing high quality indoor coverage and capacity to customers who are not located close to macrocells. Residential femtocells generally support between 6 and 12 simultaneous connections, whereas enterprise small cells can support between 16 and 128 simultaneous connections and approach the performance of picocells. Managing a large femtocell network requires that the performance of the network as a whole and of the individual femtocells be accurately and efficiently measured, and that actionable information be generated and fed back to the network management system. The problem is how to do this with potentially millions of femtocells reporting daily.
Managing and monitoring a network approaching one million units is not feasible using the standard network management analytics technology used to manage macrocellular networks of a few thousand base stations [6], [7]. A new approach that can scale to potentially one million or more femtocells is required. Femtocells must meet or exceed the same radio performance targets that are required of conventional macrocellular base stations in terms of standard Key Performance Indicators (KPI's) that attempt to measure user experience.
Residential femtocells use the customer's home broadband Internet connection to backhaul to the operator's core network and are usually managed and provisioned using Broadband Forum standard TR-069 protocols and servers that were originally developed for the cable/DSL market. They are deployed in a random, ad-hoc manner like Wi-Fi systems. Finally, as consumer devices, they must be absolutely plug-and-play and very low cost, and all interactions between the network management system and the individual femtocells must be transparent to the home user.
A mobile cellular operator's RF and network support teams cannot check the performance of every femtocell in a one million node network each day, or even respond to more than a few tens of trouble tickets each week. The challenge facing wireless network operators who deploy large numbers of femtocells is how to manage them with a minimal staff through the use of informatics and analytics.
This paper describes in detail how performance data from several large deployed femtocell networks totaling close to one million units is collected, analyzed, and then used to manage real deployed femtocell networks. The femtocell network performance management system described in this paper is based on large-scale automation of the complete analysis process from collection to presentation, employing parallel processing of the data, rule-based artificial intelligence, and data mining.
The top-level requirements for the femtocell network performance management system are:
• Identify and automatically take corrective action on individual femtocells that do not meet network KPI performance thresholds, or that have not successfully completed provisioning or plug-and-play activation.
• Collect and merge data from other network elements that comprise the femtocell solution to create for the network management team a complete picture of network performance and overall user experience.
• Present the femtocell network performance data in ways that can be effectively used by multiple tiers of network and customer support teams.

Understanding the Femtocell Network Data Model
The data model for the femtocell management network illustrated in Figure 1 consists of:
Operational Measurements (OM's): OM's are individual femtocell-level counters such as call attempts, failures, call drops, estimated data rate, and data sent. OM data is usually collected hourly at the femtocell and uploaded to a storage server once per day at a random time. Every day, several hundred individual metrics are collected from each femtocell in a comma-delimited, compressed file. Different femtocell vendors in a multi-vendor network and each product class (enterprise, residential, LTE, CDMA) have a different set of OM's. Individual OM's can be cumulative, instantaneous snapshots, averages, or minimum and maximum values.
Connection Records: Most macro networks create some form of Call Record (CR) for each voice and data connection. Due to the volume of data that needs to be moved from the femtocell to the core, most femtocell networks do not download or collect call records for each call event. Some femtocells collect records only for call failures, while some collect call record information but only upload records on demand. The triggering of the upload of call records can be either automatic or manual.
Debug Logs: Femtocells generally have some kind of logging framework for debugging. The information collected in debug logs rolls over every few hours based on activity level. Debug logs can be uploaded on demand, but normally just continue to roll over. Higher resolution logging of various components can be enabled, but it causes the log files to roll over significantly faster. Again, the transfer can be manually or automatically triggered for only a small number of femtocells.
Femtocell Configuration Record: A snapshot of the current configuration of each femtocell is produced once per day. Information in the configuration record includes: channel assignments (voice and data), beacon channels for attracting the handset to the femtocell (voice and data), transmitter power for voice, data and beacon channels, latitude and longitude of the unit, status of the femtocell including radio status, neighbor lists, last contact time, other RF parameters, software version, debug version, product class, and error codes.
Alarm Data: Alarms are event-driven records that are generated when something happens, such as a sector going down, whereas OM's are a series of counters collected at regular intervals. Alarm data is uploaded to the management system and correlated for data mining. One million femtocell nodes create large numbers of alarms, and automated processing to determine when alarms are correlated is an important function of the management system.

Information Analysis Algorithms
To be able to process all the data in a timely manner each day, the architecture of the network performance management system incorporates a high degree of parallelism in the overall processing of the data. Parallelism is introduced in the algorithm by designing the data processing workflow to split the work into processing units that can be processed in a standalone manner. At the end of each processing step, the data from each of the parallel tasks is merged together. In each subsequent processing step, the remaining data is then broken down again into new parallel processing units. This "divide and merge" algorithm is incorporated throughout the system. The overall architecture is based on the use of metafiles that define the types of OM's and the processing to do on each OM, so that different forms of OM files and different versions can be supported. Examples of different types of OM processing include: cumulative OM's, hourly sums, averages, minimum value, maximum value, and most recent value. The flow of data through the performance management system is illustrated in Figure 1 and described in the following paragraphs.
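The metafile-driven OM processing described above can be illustrated with a minimal Python sketch. The OM names, the metafile layout, and the choice to treat a cumulative OM as the growth of a counter over the day are assumptions made for illustration; the actual metafile format is not specified here.

```python
from statistics import mean

# Hypothetical metafile: OM name -> processing type (names are illustrative only).
METAFILE = {
    "call_attempts": "hourly_sum",
    "call_drops": "hourly_sum",
    "uptime_seconds": "cumulative",
    "tx_power_dbm": "average",
    "rssi_min_dbm": "minimum",
    "users_max": "maximum",
    "sw_state": "most_recent",
}

# One reducer per processing type named in the text.
AGGREGATORS = {
    "hourly_sum": sum,
    "cumulative": lambda v: v[-1] - v[0],  # assumption: growth of a running counter
    "average": mean,
    "minimum": min,
    "maximum": max,
    "most_recent": lambda v: v[-1],
}

def aggregate_daily(hourly_samples, metafile=METAFILE):
    """Reduce {om_name: [hourly values]} to one daily value per known OM."""
    daily = {}
    for om, values in hourly_samples.items():
        kind = metafile.get(om)
        if kind is None or not values:
            continue  # OM not described by the metafile, or no samples received
        daily[om] = AGGREGATORS[kind](values)
    return daily
```

Because the processing type is looked up per OM, supporting a new vendor or software version in this sketch only requires a different metafile, not new code.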
Process 1: Collect the Data: The OM data, alarms, configuration records and other data are moved from regional storage servers to the central processing server. For a network with about 700,000 femtocell nodes, the daily data set is about 3.5 GB of compressed data from multiple regional servers. The collection of data from the regional servers occurs in parallel.
Process 2: Extract, Merge and Clean Up the Data: Files occasionally get corrupted, may have too many or too few fields, and may represent different software versions. This step cleans up corrupted data and makes all files follow a specified template regardless of software version. If there are multiple files for the same femtocell (the uploading period can be set from 1 hour to 24 hours), it merges the data, removes duplicate lines, and places the lines in chronological order. The processing operates directly on streams of compressed data files, eliminating a disk-intensive intermediate step of uncompressing to disk. On any given day, data from several hundred out of one million units may be corrupted in some form.
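A minimal Python sketch of this clean-and-merge step follows. It assumes gzip-compressed CSV uploads whose timestamp column sorts chronologically as text (for example ISO 8601); the function names and the row layout are hypothetical. The payloads are decompressed as in-memory streams, so nothing is written to disk.

```python
import csv
import gzip
import io
import zlib

def read_clean_rows(gz_bytes, expected_fields):
    """Yield well-formed rows from one gzip-compressed CSV payload."""
    try:
        # Decompress as a stream; no uncompressed copy is written to disk.
        with gzip.open(io.BytesIO(gz_bytes), mode="rt", newline="") as stream:
            for row in csv.reader(stream):
                if len(row) == expected_fields:  # drop rows with extra or missing fields
                    yield tuple(row)
    except (OSError, EOFError, zlib.error, UnicodeDecodeError):
        return  # unreadable or truncated payload: keep the rows read so far

def merge_uploads(payloads, expected_fields, time_col=0):
    """Merge all uploads for one femtocell: de-duplicate, then sort by timestamp."""
    unique = set()
    for payload in payloads:
        unique.update(read_clean_rows(payload, expected_fields))
    return sorted(unique, key=lambda r: (r[time_col], r))
```

In a real deployment the payloads would be read from files on the processing server; bytes objects are used here only to keep the sketch self-contained.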
Process 3: Analyze the OM Data: A metafile tells the processing program how to process each of the several hundred OM's received from each femtocell. OM's can be processed as cumulative counters, hourly metrics, averages, minimum, maximum, snapshots, times, etc. The output of the primary processing step is a set of comma-delimited spreadsheets with one line per femtocell. The OM data is merged with the configuration data and the alarm history data, indexed by MAC ID (femtocell unit identifier). The information is also indexed in separate flat files by femtocell MAC ID, by hour of the day (local time), by region, by product class, and by software version. KPI's are calculated from individual OM's according to metafiles that describe each KPI formula. Daily reports produced by the system are automatically distributed to servers accessible by different customer service teams.
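The KPI calculation and the merge by MAC ID might look like the following Python sketch. The KPI names, the ratio-style formulas, and the OM names are hypothetical stand-ins; the actual KPI metafiles and formulas are operator specific and are not given in this paper.

```python
# Hypothetical KPI metafile: KPI name -> (numerator OM, denominator OM).
KPI_METAFILE = {
    "drop_rate": ("call_drops", "calls_established"),
    "setup_failure_rate": ("call_failures", "call_attempts"),
}

def compute_kpis(om_row, kpi_metafile=KPI_METAFILE):
    """Evaluate each ratio KPI; None means the denominator was zero or missing."""
    kpis = {}
    for name, (num, den) in kpi_metafile.items():
        denominator = om_row.get(den, 0)
        kpis[name] = om_row.get(num, 0) / denominator if denominator else None
    return kpis

def build_report(om_by_mac, config_by_mac):
    """Produce one merged line per femtocell, keyed by MAC ID."""
    report = {}
    for mac, om_row in om_by_mac.items():
        line = dict(config_by_mac.get(mac, {}))  # configuration snapshot, if any
        line.update(om_row)                      # daily OM values
        line.update(compute_kpis(om_row))        # derived KPI's
        report[mac] = line
    return report
```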
Processing steps 2 and 3 are performed in a parallel architecture using the "divide and merge" algorithm illustrated in Figure 2. The ensemble of files is broken into processing units of approximately 70,000 files each. The program currently uses approximately 13 of the 16 cores of the processing server (the server has a very fast disk, and disk access rather than the number of cores used has been determined to be the limiting factor in processing). When all the tasks complete, the outputs are merged together. Intermediate files are then split again and parallel tasks are run again.
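One way to picture the "divide and merge" pattern is the Python sketch below, which splits records into standalone units, processes the units concurrently, and merges the partial results. The record layout and function names are assumptions. A thread pool is used to keep the example self-contained; for CPU-bound work a process pool such as `concurrent.futures.ProcessPoolExecutor` could be passed as `executor_cls` instead.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def split_into_units(items, unit_size):
    """Divide: break the full list into independent processing units."""
    return [items[i:i + unit_size] for i in range(0, len(items), unit_size)]

def process_unit(unit):
    """Standalone task: total the counters in one unit of (mac, om_dict) records."""
    totals = Counter()
    for _mac, oms in unit:
        totals.update(oms)  # Counter.update adds values key by key
    return totals

def divide_and_merge(records, unit_size, workers=13, executor_cls=ThreadPoolExecutor):
    """Run process_unit over all units concurrently, then merge the partial totals."""
    units = split_into_units(records, unit_size)
    with executor_cls(max_workers=workers) as pool:
        partials = list(pool.map(process_unit, units))
    merged = Counter()
    for partial in partials:  # Merge: combine the per-unit outputs
        merged.update(partial)
    return merged
```

The default of 13 workers simply mirrors the core count quoted above; since disk access is reported as the limiting factor, the best value would need to be measured on the target server.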
Process 4: Identify Poorly Performing Femtocells: The fourth step in the process is designed to create reports describing femtocells that need intervention because their performance is outside of operator-specified performance thresholds. The criteria for marking a femtocell as needing attention, along with the minimum number of events defining statistical significance to declare the unit out of specification, are described in an operator-specific metafile. The metafile includes the KPI or metric that is to be monitored, the trigger based on the metric, and how many events are required before the trigger can fire. Ideally, femtocells identified by this process should automatically trigger the download of the next level of debug information, such as call failure records and debug logs. Reports of poorly performing femtocells are produced weekly to get a statistical sample large enough to be significant (residential femtocells generally require about one week of reporting data to get a good statistical estimate of KPI's).
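The combination of a KPI trigger and a minimum event count can be sketched in Python as follows. The rule fields, the 2% drop-rate threshold, and the minimum of 50 events are made-up illustrative values, not operator settings taken from this paper.

```python
# Hypothetical operator metafile entries: metric, trigger threshold, minimum events.
RULES = [
    {"kpi": "drop_rate", "num": "call_drops", "den": "calls_established",
     "max_value": 0.02, "min_events": 50},
]

def out_of_spec(weekly_oms, rules=RULES):
    """Return (kpi, value) pairs that breach a threshold with enough events behind them."""
    flagged = []
    for rule in rules:
        events = weekly_oms.get(rule["den"], 0)
        if events == 0 or events < rule["min_events"]:
            continue  # sample too small to declare the unit out of specification
        value = weekly_oms.get(rule["num"], 0) / events
        if value > rule["max_value"]:
            flagged.append((rule["kpi"], value))
    return flagged
```

For example, 2 drops in 10 calls is a 20% drop rate but is not flagged under these assumed rules, because 10 events fall short of the minimum sample size.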
A set of heuristic-based artificial intelligence (AI) routines attempts to match known system failure signatures that indicate that the femtocell is improperly configured or not functioning properly. Femtocells that match are marked for intervention by network support teams, reboot, factory reset, or parameter replanning. Currently the intervention is performed semi-automatically: a member of the team reviews the proposed changes before they are initiated. In the future this data will be fed back into the Femtocell Service Manager (FSM) automatic network planning module (ANP) for automatic action, without review.
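A rule-based matcher of this kind can be sketched in Python as a list of signatures, each pairing a predicate with a proposed action, followed by a review step. The three signatures and the record fields below are invented for illustration; the actual failure signatures are not listed in this paper.

```python
# Hypothetical failure signatures: (name, predicate on a femtocell record, proposed action).
SIGNATURES = [
    ("activation_incomplete",
     lambda f: not f.get("provisioned", False),
     "factory_reset"),
    ("radio_off_while_provisioned",
     lambda f: f.get("provisioned", False) and f.get("radio_status") == "off",
     "reboot"),
    ("empty_neighbor_list",
     lambda f: f.get("provisioned", False) and not f.get("neighbor_list"),
     "parameter_replanning"),
]

def propose_actions(femto):
    """Return (signature name, action) for every signature the record matches."""
    return [(name, action) for name, matches, action in SIGNATURES if matches(femto)]

def review_and_release(proposals, approve):
    """Semi-automatic step: keep only the proposals that the reviewer callback approves."""
    return [proposal for proposal in proposals if approve(proposal)]
```

Here `approve` stands in for the team member who reviews each proposed change; closing the loop fully would amount to replacing it with a callback that always returns True.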
Process 5: Create Map Based Reports for Customer Support and Performance Teams: The current implementation uses Google Earth and KML to create easy-to-use interactive maps showing the performance of each femtocell. One colored pin is created for each femtocell, and different shapes of pins are used for different classes of femtocells, such as enterprise and residential femtocells. The entry associated with each pushpin provides the configuration of the femtocell, KPI's and other performance metrics, and a color code indicating whether the femtocell is meeting operator performance specifications, with different colors for femtocells that need to be monitored (see Figure 3). Customer support teams use the graphical tools to help understand whether the parameters from the network planning function are correct for a specific deployment scenario. The tools help answer questions such as: Is the dwelling type correct (house, large house, business, warehouse, etc.)?
Is there a nearby macrocell that is not in the neighbor list? Graphical outputs clearly show regional or geographic dependence (see Figure 4).
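The map output of Process 5 can be illustrated with a short Python sketch that writes one KML Placemark per femtocell using only the standard library. The record fields and status names are assumptions; the three colors follow the key of Figure 3. KML colors are written as aabbggrr hex, and how a tint appears in Google Earth depends on the icon it is applied to. Pin shapes per product class are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# KML color strings are aabbggrr (alpha, blue, green, red), not rrggbb.
STATUS_COLORS = {
    "within_spec": "ff00ff00",      # green
    "needs_reconfig": "ff800080",   # purple
    "needs_attention": "ff00a5ff",  # orange
}

def femtocells_to_kml(femtocells):
    """Build a KML document string with one colored Placemark per femtocell."""
    kml = ET.Element("kml", xmlns="http://www.opengis.net/kml/2.2")
    doc = ET.SubElement(kml, "Document")
    for femto in femtocells:
        placemark = ET.SubElement(doc, "Placemark")
        ET.SubElement(placemark, "name").text = femto["mac"]
        ET.SubElement(placemark, "description").text = femto.get("description", "")
        icon_style = ET.SubElement(ET.SubElement(placemark, "Style"), "IconStyle")
        ET.SubElement(icon_style, "color").text = STATUS_COLORS[femto["status"]]
        point = ET.SubElement(placemark, "Point")
        # KML coordinates are ordered longitude,latitude.
        ET.SubElement(point, "coordinates").text = f"{femto['lon']},{femto['lat']}"
    return ET.tostring(kml, encoding="unicode")
```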

Conclusion
The femtocell network performance management system described here is currently managing close to one million small cells from multiple vendors and multiple product classes in several global networks. The system produces outputs that are sent to different teams, including Tier 1, 2 and 3 customer support teams as well as RF, network performance engineering, and data analysis groups. In the future the loop will be fully closed, and the system will be trusted to automatically make changes to the femtocell configuration when it detects that parameters are not optimally set.

Figure 1. Architecture of the femtocell service manager and femtocell performance systems showing data flow.

Figure 2. Flow of data through the system showing the "divide and merge" architecture.

Process 6: Create System Level Dashboards and Identify Potential Network-Wide Issues: Dashboards are created for each product class (e.g., enterprise femtocells, residential femtocells, and different femtocell hardware versions, as required by the operator). The dashboards are e-mailed automatically to a wide audience each day. A special e-mail is automatically sent to the support team if any of the large-scale parameters deviates from the norm by more than 10%. The processing in step 6 is written in the .NET development environment so that it couples directly to Excel, PowerPoint, and other standard presentation tools.
Process 7: Produce System Wide Analytics: In recent years, the femtocell analytics platform has been used by operators to better understand customer usage patterns within their networks, including volumes and duration of voice calls, SMS usage patterns at home, and data usage patterns for each femtocell. Each week and month, informatics packages are created and sent to different parts of the operator's organization. Output includes high-runner femtocells in terms of voice, SMS, and data usage. Applications include identifying customers who could use femtocells either to provide better coverage to the customer or to offload heavy home traffic users from the macrocell network.
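The 10% deviation check described for the Process 6 dashboards could be sketched in Python as follows. How the "norm" is defined is not stated in this paper; the sketch assumes it is a per-parameter baseline supplied by the caller (for example a trailing average), and the parameter names are hypothetical.

```python
def deviating_parameters(today, norm, tolerance=0.10):
    """Return {parameter: relative deviation} where |deviation| exceeds the tolerance."""
    alerts = {}
    for name, baseline in norm.items():
        if name not in today or baseline == 0:
            continue  # no reading today, or no meaningful baseline to compare against
        deviation = (today[name] - baseline) / baseline
        if abs(deviation) > tolerance:
            alerts[name] = deviation
    return alerts
```

A non-empty result would be the trigger for the special e-mail to the support team.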

Figure 3. Graphical interface showing performance status of femtocells. Key: Green, unit is within performance specification; Purple, unit needs to be reconfigured; Orange, unit requires attention as performance is not within required specifications.

Figure 4. Graphical Interface showing performance and configuration of an individual femtocell over one month.