Recent Articles

.NET: Detect String Encoding »

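Here is a simple way to check whether your content fits in single-byte characters (code points 0-255) or needs Unicode. Strict ASCII would use 127 as the upper bound.

public bool CheckForEncoding(string content)
{
    // True if any character lies outside 0-255. The original listing was cut off
    // after the test, so the bool return value is an assumption.
    foreach (char c in content)
        if (c > 255)
            return true;
    return false;
}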


Open Source Text Analytics »

GATE – General Architecture for Text Engineering


Apache UIMA

Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. An example UIM application might ingest plain text and identify entities, such as persons, places, organizations; or relations, such as works-for or located-at.

UIMA enables applications to be decomposed into components, for example “language identification” => “language specific segmentation” => “sentence boundary detection” => “entity detection (person/place names etc.)”. Each component implements interfaces defined by the framework and provides self-describing metadata via XML descriptor files. The framework manages these components and the data flow between them. Components are written in Java or C++; the data that flows between components is designed for efficient mapping between these languages.
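The component chain described above can be pictured with a small sketch. This is not the UIMA API: the function names and the dictionary standing in for the data that flows between components are hypothetical, and the "analysis" in each step is deliberately naive.

```python
import re

# Hypothetical stand-ins for pipeline components: each one receives the
# shared analysis record, adds its own annotations, and returns it.

def identify_language(doc):
    # Naive placeholder: treat pure-ASCII text as English.
    doc["language"] = "en" if doc["text"].isascii() else "unknown"
    return doc

def detect_sentences(doc):
    # Split after '.', '!' or '?' when followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", doc["text"])
    doc["sentences"] = [part for part in parts if part]
    return doc

def detect_entities(doc):
    # Toy rule: a run of two or more capitalized words is an entity.
    doc["entities"] = re.findall(r"(?:[A-Z][a-z]+ )+[A-Z][a-z]+", doc["text"])
    return doc

def run_pipeline(text, components):
    """Pass one record through each component in order."""
    doc = {"text": text}
    for component in components:
        doc = component(doc)
    return doc

PIPELINE = [identify_language, detect_sentences, detect_entities]
result = run_pipeline("Ada Lovelace worked in London. She wrote programs.", PIPELINE)
print(result["sentences"])
print(result["entities"])
```

Because every step shares one calling convention, a component can be swapped or reordered without touching the others, which is the point of the decomposition.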


RapidMiner (formerly YALE) and its plugins provide more than 400 operators for all aspects of data mining. Meta operators automatically optimize the experiment designs, so users no longer need to tune single steps or parameters. A large number of visualization techniques and the ability to place breakpoints after each operator give insight into the success of your design, even online for running experiments.


NLTK is a suite of open source Python modules, linguistic data and documentation for research and development in natural language processing, supporting dozens of NLP tasks, with distributions for Windows, Mac OS X and Linux.


OpenNLP provides the organizational structure for coordinating several different projects which approach some aspect of Natural Language Processing. OpenNLP also defines a set of Java interfaces and implements some basic infrastructure for NLP components.

R Text Mining

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS. The R Text Mining package can be used for text analysis.


Open Source SOA with SCA Implementation »

Apache Tuscany

Apache Tuscany simplifies the task of developing SOA solutions by providing a comprehensive infrastructure for SOA development and management that is based on the Service Component Architecture (SCA) standard. With SCA as its foundation, Tuscany offers solution developers the following advantages.

  • Provides a model for creating composite applications by defining the services in the fabric and their relationships with one another. The services can be implemented in any technology.
  • Enables service developers to create reusable services that only contain business logic. Protocols are pushed out of business logic and are handled through pluggable bindings. This lowers development cost.
  • Applications can easily adapt to infrastructure changes without recoding since protocols are handled via pluggable bindings and quality of services (transaction, security) are handled declaratively.
  • Existing applications can work with new SCA compositions. This allows for incremental growth towards a more flexible architecture, outsourcing or providing services to others.

    In addition, Tuscany is integrated with various technologies and offers:

    • a wide range of bindings (pluggable protocols)
    • various component types including but not limited to Java, C++, BPEL, Spring and scripting
    • an end-to-end service and data solution which includes support for JAXB and SDO
    • a lightweight runtime that works standalone or with other application servers
    • a modular architecture that makes it easy to integrate with different technologies and to extend
    • integration with Web 2.0 technologies
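The idea of keeping protocols out of business logic can be illustrated with a minimal sketch. Nothing here is Tuscany or SCA API: the two binding classes, the `quote_price` service, and its pricing rule are hypothetical and only show one service function being exposed through two interchangeable transports.

```python
import json

def quote_price(quantity):
    """Business logic only: 5 per unit, with 10% off orders of 10 or more."""
    total = 5 * quantity
    return total - total // 10 if quantity >= 10 else total

class JsonBinding:
    """Hypothetical binding that carries requests and replies as JSON text."""
    def __init__(self, service):
        self.service = service

    def handle(self, payload):
        args = json.loads(payload)
        return json.dumps({"result": self.service(**args)})

class QueryStringBinding:
    """Hypothetical binding that carries requests as key=value pairs."""
    def __init__(self, service):
        self.service = service

    def handle(self, payload):
        pairs = (pair.split("=") for pair in payload.split("&"))
        args = {key: int(value) for key, value in pairs}
        return str(self.service(**args))

# The same service function is wired to either binding without being changed.
json_reply = JsonBinding(quote_price).handle('{"quantity": 10}')
query_reply = QueryStringBinding(quote_price).handle("quantity=3")
print(json_reply, query_reply)
```

Swapping the transport means choosing a different binding class; `quote_price` itself never sees the wire format.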


Newton is a distributed OSGi framework in which the components can be simple POJOs or wrappers around components based on other models.

Newton recognises the dynamic nature of distributed computing and seeks to address the needs of components living in this world. To this end Newton moves code around the network installing it on demand and removing it when it is no longer in use. Newton also dynamically wires up runtime service dependencies between components and rewires them as service provider components come and go.

Newton describes distributed systems using the emerging SCA standard. Newton provides a highly dynamic SCA implementation. It is able to install and manage SCA composites distributed across a large number of JVMs, continually comparing the deployed composite graph to a specified target state and making adjustments in response to failures and network topology changes.

Newton makes use of OSGi for wiring up composites within a single JVM and Jini technology for tracking and wiring up dependencies between composites in different JVMs.
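Comparing a deployed composite graph with a target state can be sketched as a simple reconciliation step. This is an illustration of the general idea only, not Newton's implementation: the `reconcile` function, the dictionaries of instance counts, and the composite names are all assumptions.

```python
def reconcile(target, deployed):
    """Return the actions needed to move `deployed` towards `target`.

    Both arguments map a composite name to a number of running instances.
    """
    actions = []
    for name in sorted(set(target) | set(deployed)):
        want = target.get(name, 0)
        have = deployed.get(name, 0)
        if want > have:
            actions.append(("install", name, want - have))
        elif have > want:
            actions.append(("remove", name, have - want))
    return actions

target = {"orders": 3, "billing": 1}
# One "orders" instance has been lost and an unused "legacy" composite lingers.
deployed = {"orders": 2, "billing": 1, "legacy": 1}

actions = reconcile(target, deployed)
print(actions)
```

Running such a comparison repeatedly, and applying the resulting actions, is what lets a system drift back to its target state after failures.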


Fabric3 is an innovative platform for assembling, provisioning, and managing distributed applications, whether they are deployed to a corporate datacenter or to a cloud environment. As applications evolve from single-stack, non-integrated architectures to sets of loosely-coupled services, there is a need for a new approach to building and managing these environments. Existing middleware technologies make realizing service-based architectures unnecessarily complex and managing them costly. Fabric3 fills this void.

Fabric3 leverages SCA to provide a standard, simplified programming model for creating services and assembling them into applications. The SCA programming model allows services to be securely, reliably, and efficiently integrated without the need for applications to manage low-level communications details.


Misc C/C++ Tools »

GNU gprof is an open source profiling tool that every *nix programmer should know.

GDB, the GNU Project debugger, allows you to see what is going on “inside” another program while it executes, or what another program was doing at the moment it crashed.

ccache is a compiler cache. It acts as a caching pre-processor to C/C++ compilers, using the -E compiler switch and a hash to detect when a compilation can be satisfied from cache. This often results in a 5 to 10 times speedup in common compilations.
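The hash-and-lookup idea can be sketched as follows. This is a toy model, not ccache's code: the `CompileCache` class, the comment-stripping stand-in for the preprocessor, and the fake compiler are hypothetical, and the real tool hashes more than the sketch does.

```python
import hashlib

class CompileCache:
    """Toy cache keyed by a hash of the preprocessed source and the flags."""

    def __init__(self, preprocess, compile_fn):
        self.preprocess = preprocess
        self.compile_fn = compile_fn
        self.store = {}
        self.hits = 0

    def compile(self, source, flags=""):
        expanded = self.preprocess(source)
        key = hashlib.sha256((flags + "\0" + expanded).encode()).hexdigest()
        if key in self.store:
            self.hits += 1                      # satisfied from cache
        else:
            self.store[key] = self.compile_fn(expanded)
        return self.store[key]

def strip_comments(source):
    # Stand-in for the preprocessor: drop // comments and empty lines.
    lines = (line.split("//")[0].rstrip() for line in source.splitlines())
    return "\n".join(line for line in lines if line)

calls = []

def fake_compile(expanded):
    # Stand-in for the expensive compiler run.
    calls.append(expanded)
    return "object code for %d bytes" % len(expanded)

cache = CompileCache(strip_comments, fake_compile)
first = cache.compile("int x = 1; // counter\n")
second = cache.compile("int x = 1; // renamed comment\n")
print(cache.hits, len(calls))
```

Hashing the preprocessed text rather than the raw file is why an edit that only touches a comment can still be served from the cache.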

Cscope is a developer’s tool for browsing source code. It has an impeccable Unix pedigree, having been originally developed at Bell Labs back in the days of the PDP-11. Cscope was part of the official AT&T Unix distribution for many years, and has been used to manage projects involving 20 million lines of code!


C/C++ Library to Detect Buffer Overruns and Underruns »

DUMA is an open-source library (under GNU General Public License) to detect buffer overruns and under-runs in C and C++ programs.
This library is a fork of Bruce Perens' Electric Fence library and adds some new features to it. Features of the DUMA library:

  • “overloads” all standard memory allocation functions like malloc(), calloc(), memalign(), strdup(), operator new, operator new[]
    and also their counterpart deallocation functions like free(), operator delete and operator delete[]
  • utilizes the MMU (memory management unit) of the CPU:
    allocates and protects an extra memory page to detect any illegal access beyond the top of the buffer (or bottom, at the user’s option)
  • stops the program at exactly the instruction that performs the erroneous access to the protected memory page,
    allowing location of the defective source code in a debugger
  • detects erroneous writes at the non-protected end of the memory block at deallocation of the memory block
  • detects mismatch of allocation/deallocation functions: e.g. allocation with malloc() but deallocation with operator delete
  • leak detection: detect memory blocks which were not deallocated until program exit
  • runs on Linux / U*ix and MS Windows NT/2K/XP operating systems
  • preloading of the library on Linux (and some U*ix) systems allowing tests without necessity of changing source code or recompilation
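Two of the checks in the list above, mismatched allocation/deallocation pairs and leak detection, come down to bookkeeping that can be sketched in a few lines. This is a toy model only: DUMA works on real memory and uses the MMU, whereas the hypothetical `AllocationTracker` below just records which function created each block.

```python
class AllocationTracker:
    """Toy bookkeeping of live blocks and the function that allocated them."""

    PAIRS = {"malloc": "free", "new": "delete", "new[]": "delete[]"}

    def __init__(self):
        self.live = {}       # block id -> allocator name
        self.next_id = 1

    def allocate(self, allocator):
        block = self.next_id
        self.next_id += 1
        self.live[block] = allocator
        return block

    def release(self, block, deallocator):
        allocator = self.live.pop(block)
        if deallocator != self.PAIRS[allocator]:
            raise RuntimeError(
                f"block {block}: allocated with {allocator} "
                f"but released with {deallocator}"
            )

    def leaks(self):
        # Blocks still live at "program exit".
        return sorted(self.live)

tracker = AllocationTracker()
a = tracker.allocate("malloc")
b = tracker.allocate("new[]")
c = tracker.allocate("new")       # never released, so it is reported as a leak

tracker.release(a, "free")
try:
    tracker.release(b, "delete")  # should have been delete[]
    mismatch = None
except RuntimeError as error:
    mismatch = str(error)

leaked = tracker.leaks()
print(mismatch, leaked)
```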


Free ASP.NET MVC eBook Tutorial »

Free chapter from the ASP.NET MVC ebook on how to build a full-fledged application.


Open Source OSGi Framework »

Apache Felix


Felix is a community effort to implement the OSGi R4 Service Platform, which includes the OSGi framework and standard services, as well as providing and supporting other interesting OSGi-related technologies. The ultimate goal is to provide a completely compliant implementation of the OSGi framework and standard services and to support a community around this technology. Felix currently implements a large portion of the OSGi release 4 specification, but additional work is necessary for full compliance. Despite this fact, the OSGi framework functionality provided by Felix is very stable.

OSGi technology originally targeted embedded devices and home services gateways, but it is ideally suited for any project that is interested in the principles of modularity, component orientation, and/or service orientation. OSGi technology combines aspects of these principles to define a dynamic service deployment framework that is amenable to remote management. As an example of a simple use case, Felix can be easily embedded into other projects and used as a plugin or dynamic extension mechanism; it serves this purpose much better than other systems that are used for similar purposes, such as Java Management Extensions (JMX).


Eclipse Equinox

From a code point of view, Equinox is an implementation of the OSGi R4 core framework specification, a set of bundles that implement various optional OSGi services and other infrastructure for running OSGi-based systems.

More generally, the goal of the Equinox project is to be a first class OSGi community and foster the vision of Eclipse as a landscape of bundles. As part of this, it is responsible for developing and delivering the OSGi framework implementation used for all of Eclipse. In addition, the project is open to:

  • Implementation of all aspects of the OSGi specification (including the EEG, MEG and VEG work)
  • Investigation and research related to future versions of OSGi specifications and related runtime issues
  • Development of non-standard infrastructure deemed to be essential to the running and management of OSGi-based systems
  • Implementation of key framework services and extensions needed for running Eclipse (e.g., the Eclipse Adaptor, Extension registry) and deemed generally useful to people using OSGi.



Knopflerfish is another open source OSGi framework. Led and maintained by Makewave, Knopflerfish delivers significant value as the key container technology for many Java-based projects and products.


Oracle: Partition Key is Important in Queries »

Query cost can be very high if you do not filter on the partition key.


With partition key



Oracle: Detect Locked Objects »

Queries that are sometimes useful for resolving locked objects in Oracle.

To show locks:

select O.object_name,
       S.sid,
       S.serial#  -- the original column list was cut off; sid and serial# are assumed
from dba_objects O,
v$locked_object L,
v$session S
where O.object_id = L.object_id
and S.sid = L.session_id;

To unlock,



Open Source C++ Analysis Tool »

Valgrind is an award-winning instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile your programs in detail. You can also use Valgrind to build new tools.


The Valgrind distribution currently includes six production-quality tools: a memory error detector, two thread error detectors, a cache and branch-prediction profiler, a call-graph generating cache profiler, and a heap profiler. It also includes one experimental tool, which detects out of bounds reads and writes of stack, global and heap arrays. It runs on the following platforms: X86/Linux, AMD64/Linux, PPC32/Linux, PPC64/Linux.

Valgrind is Open Source / Free Software, and is freely available under the GNU General Public License, version 2.


Memcheck detects memory-management problems, and is aimed primarily at C and C++ programs. When a program is run under Memcheck’s supervision, all reads and writes of memory are checked, and calls to malloc/new/free/delete are intercepted. As a result, Memcheck can detect if your program:

  • Accesses memory it shouldn’t (areas not yet allocated, areas that have been freed, areas past the end of heap blocks, inaccessible areas of the stack).
  • Uses uninitialised values in dangerous ways.
  • Leaks memory.
  • Does bad frees of heap blocks (double frees, mismatched frees).
  • Passes overlapping source and destination memory blocks to memcpy() and related functions.

Memcheck reports these errors as soon as they occur, giving the source line number at which it occurred, and also a stack trace of the functions called to reach that line. Memcheck tracks addressability at the byte-level, and initialisation of values at the bit-level. As a result, it can detect the use of single uninitialised bits, and does not report spurious errors on bitfield operations. Memcheck runs programs about 10–30x slower than normal.


Cachegrind is a cache profiler. It performs detailed simulation of the I1, D1 and L2 caches in your CPU and so can accurately pinpoint the sources of cache misses in your code. It identifies the number of cache misses, memory references and instructions executed for each line of source code, with per-function, per-module and whole-program summaries. It is useful with programs written in any language. Cachegrind runs programs about 20–100x slower than normal.
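To give a feel for what simulating a cache involves, here is a minimal sketch of a direct-mapped cache that counts misses for a list of addresses. It is far simpler than what Cachegrind does (one cache level, no associativity, no instruction cache), and the function name, line size, and line count are assumptions chosen for the example.

```python
def count_misses(addresses, line_size=64, num_lines=4):
    """Simulate a direct-mapped cache; return (misses, references)."""
    lines = [None] * num_lines            # memory block held by each cache line
    misses = 0
    for address in addresses:
        block = address // line_size      # memory block containing the address
        index = block % num_lines         # the only cache line that block may use
        if lines[index] != block:
            misses += 1
            lines[index] = block
    return misses, len(addresses)

# 32 eight-byte reads in order: each 64-byte line is fetched once, then reused.
sequential = [i * 8 for i in range(32)]

# Two addresses 256 bytes apart map to the same cache line and evict each other.
conflicting = [(i % 2) * 256 for i in range(32)]

print(count_misses(sequential))
print(count_misses(conflicting))
```

The two access patterns touch the same number of addresses, yet the second one misses on every reference, which is the kind of difference a cache profiler attributes to individual source lines.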


Callgrind, by Josef Weidendorfer, is an extension to Cachegrind. It provides all the information that Cachegrind does, plus extra information about callgraphs. It was folded into the main Valgrind distribution in version 3.2.0. Available separately is an amazing visualisation tool, KCachegrind, which gives a much better overview of the data that Callgrind collects; it can also be used to visualise Cachegrind’s output.


Massif is a heap profiler. It performs detailed heap profiling by taking regular snapshots of a program’s heap. It produces a graph showing heap usage over time, including information about which parts of the program are responsible for the most memory allocations. The graph is supplemented by a text or HTML file that includes more information for determining where the most memory is being allocated. Massif runs programs about 20x slower than normal.


Helgrind is a thread debugger which finds data races in multithreaded programs. It looks for memory locations which are accessed by more than one (POSIX p-)thread, but for which no consistently used (pthread_mutex_) lock can be found. Such locations are indicative of missing synchronisation between threads, and could cause hard-to-find timing-dependent problems. It is useful for any program that uses pthreads. It is a somewhat experimental tool, so your feedback is especially welcome here.
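The rule described above, a location touched by several threads with no lock held consistently, can be sketched as a simplified lockset check. This follows the description rather than Helgrind's actual implementation; the `find_races` function and the trace format are hypothetical.

```python
def find_races(accesses):
    """accesses: iterable of (thread, location, set of locks held).

    Returns the locations accessed by more than one thread for which
    no single lock was held during every access.
    """
    threads = {}   # location -> threads that touched it
    common = {}    # location -> locks held on every access so far
    for thread, location, held in accesses:
        threads.setdefault(location, set()).add(thread)
        if location in common:
            common[location] &= set(held)
        else:
            common[location] = set(held)
    return sorted(
        location for location in common
        if len(threads[location]) > 1 and not common[location]
    )

trace = [
    ("T1", "counter", {"m"}),
    ("T2", "counter", {"m"}),    # always guarded by m: fine
    ("T1", "total", {"m"}),
    ("T2", "total", set()),      # second thread forgot the lock
    ("T1", "scratch", set()),    # unlocked, but only one thread uses it
]

races = find_races(trace)
print(races)
```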

Lackey, Nulgrind

Lackey and Nulgrind are also included in the Valgrind distribution. They don’t do very much, and are there for testing and demonstrative purposes.


Open Source .NET Digg Alternative »

KiGG is a Web 2.0 style social news web application developed with Microsoft-supported technologies.

MS Tooling:

  • Linq To SQL
  • MS Patterns & Practices – Enterprise Library (Logging & Caching)
  • MS Patterns & Practices – Unity
  • jQuery

Other Third party:

  • Moq
  • HtmlAgilityPack
  • DotNetOpenId
  • jQuery UI & Markitup

External Service Integration:

  • PageGlimpse, WebSnapr – For thumbnail generation.
  • Akismet, TypePad and Defensio. – Spam Protection.
  • reCaptcha
  • Gravatar
  • OpenID & Id Selector
  • URL shrinking services

Open Standard implementation:

  • hAtom, hReview, hVote, xFolk etc.
  • OpenSearch
  • SiteMap (Standard, Mobile, News)
  • RSS/Atom

It demonstrates how to develop a very loosely coupled application with Microsoft tooling following domain-driven design.


Open Source Flash Movie Player »

Gnash is a GNU Flash movie player.

  • Runs standalone – Gnash can run standalone to play flash movies.
  • Browser plugin – Gnash can also run as a plugin from within most Mozilla derived browsers, such as Firefox. Gnash also has support for Konqueror.
  • SWF v7+ compliant – Gnash can play many current flash movies.
  • Streaming Video – Gnash supports the viewing of streaming video from popular video sharing sites.
  • XML Message server – Gnash also supports an XML based message system as documented in the Flash Format specification.
  • High Quality Output – Gnash uses OpenGL for rendering the graphics on the desktop, and AntiGrain (AGG) for embedded framebuffer only devices.
  • Free Software  – Gnash is 100% free software. For more information on the GPL, go to the Free Software Foundation web site.
  • Better Security – Gnash pays extra attention to all network connections, and allows the user to control access.
  • Extensible – Gnash supports extending ActionScript by creating your own extensions. You can write wrappers for any development library, and import them into the player.


Java: Open Source Collaboration and Learning Environment »

The Sakai CLE is a flexible enterprise application that supports teaching, learning and scholarly collaboration in either fully or partially online environments. Sakai also has a robust and full-featured online portfolio system built in. The Sakai CLE is distributed as free, open-source software, which offers the ultimate in flexibility and avoids the risks of vendor lock-in and escalating license costs.

Instructors teach in a variety of different styles using a wide array of methods. Sakai meets the needs of the institution, the individual instructor and students through its highly customizable nature. Sakai’s architecture is modular and individual instructors can select the tools they want available for their class. Or you can configure sites that are specifically designed for research collaboration or administrative work groups. And because the source code is freely available, you always have the option of changing or adding a feature that would make Sakai work even better on your campus.

General Collaboration Tool

  • Announcements: Post current, time-critical information to a site.
  • Resources: Post, store and organize material related to the site.
  • Site Roster: View a list of site participants and their pictures
  • Email Archive: Access an archive of email sent to participants
  • Wiki: Create and edit web content collaboratively.
  • Blog: Provides blogging capability for your class.
  • Calendar: Maintain deadlines, activities and site related events
  • Chat: Engage in real-time conversations with site participants
  • Discussion Forum: Create, moderate and manage discussion topics and groups within a course and send private messages to site participants.
  • Glossary: Provide contextual definitions for terms used on a site
  • Web Page: Display external web pages.
  • News: Display custom news content from dynamic, online sources via RSS.


Teaching and Learning Tools

  • Syllabus: Post a summary outline of course requirements
  • Lesson Builder: Create and publish online learning sequences.
  • Assignments: Create and grade online or offline assignments.
  • Drop Box: Share files privately with site participants.
  • Gradebook: Calculate, store and distribute grade information to students
  • Tests & Quizzes: Create and manage online assessments


Portfolio Tools

  • Design, publish, share and view portfolios of work
  • Wizards & Matrices: Create structures to help site participants document and reflect upon their learning and development
  • Evaluations: Provide site participants with summative feedback on submissions to wizards and matrices
  • Reports: Build, view and export reports on portfolio-related site activity
  • Layouts & Styles: Manage pre-defined Styles used to control the visual style (fonts, colors, etc.) of Wizards and Matrices, and Portfolios
  • Portfolio Templates: Manage templates site participants use to create standardized portfolios


Administrative Tools

  • Accounts: Manage basic account information and passwords
  • Membership: View and modify site memberships
  • Site Setup: Create new sites, modify sites you own
  • Site Editor: Change the structure, content or membership of a site
  • Section Info: Manage sections within a course site
  • Super User (SU): Assume the identity of another user in the system for troubleshooting and support
  • Users: View and edit user data in the system
  • Realms: Manage roles and permissions
  • On-Line: Track server and system usage
  • Job Selector: Create scheduled data integration and data warehouse tasks


Java: Web Based Troubleshooting and Monitoring Agent »

Glassbox is an automated troubleshooting and monitoring agent for Java apps that diagnoses common problems with one click. Drop in a .war file and find out what’s wrong with your existing web apps, without any code changes.

The Glassbox troubleshooter uses Aspect-Oriented Programming (AOP) and Java Management Extensions (JMX) technology to monitor your enterprise Java applications, without forcing you to embed anything or change a single line of code. Glassbox provides a real time diagnosis of your system and cross-references it against both your service levels and our knowledge base of failures.



Java: JUnit Max »

JUnit Max is an Eclipse plug-in that helps programmers stay focused on coding by running tests intelligently and reporting results unobtrusively. Every time you save a Java file, Max runs your tests and reports errors in the same format as compile errors. In addition, Max runs the tests most likely to fail first, so you only have to pay close attention to test results for a second (literally) before getting back to coding, even if you have a long-running test suite.

Max’s optimizing runner works because of two convenient facts:

  • Test runtimes generally follow a power law distribution: lots of very short tests and a few very long ones. This means that by running the short tests first you can get most of the feedback in a fraction of the runtime of the whole suite (assuming test failures aren’t correlated with test run length, which I haven’t verified yet).
  • Test failures are not randomly distributed. A test that failed recently is more likely to fail than one that has run correctly a bazillion times in a row. By putting recently failed (and newly written) tests first in the queue, you maximize the information density of that critical first second of feedback (before you get distracted and go check Twitter).
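The two observations above suggest an ordering rule that can be sketched in a few lines. How JUnit Max actually combines them is not stated here, so the rule below (new or recently failed tests first, shortest first within each group) is an assumption, as are the `order_tests` function, the field names, and the `recent` threshold.

```python
def order_tests(tests, recent=5):
    """Order tests so the most informative ones run first.

    Each test is a dict with 'name', 'seconds' (last runtime), 'runs'
    (how often it has run) and 'runs_since_failure' (None if it never failed).
    """
    def key(test):
        is_new = test["runs"] == 0
        since = test["runs_since_failure"]
        failed_recently = since is not None and since <= recent
        group = 0 if (is_new or failed_recently) else 1
        return (group, test["seconds"])   # shortest first within each group

    return [test["name"] for test in sorted(tests, key=key)]

tests = [
    {"name": "test_report", "seconds": 12.0, "runs": 40, "runs_since_failure": None},
    {"name": "test_parse", "seconds": 0.01, "runs": 40, "runs_since_failure": None},
    {"name": "test_login", "seconds": 0.5, "runs": 40, "runs_since_failure": 1},
    {"name": "test_new", "seconds": 0.02, "runs": 0, "runs_since_failure": None},
]

order = order_tests(tests)
print(order)
```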


Java Component for Rich Text Processing »

SimplyHTML, originally developed by Ulrich Hilger, is an application and a Java component for rich text processing. It stores documents as HTML files in combination with Cascading Style Sheets (CSS). It has been chosen as the rich text editing component for the mind map editor FreeMind.



Solaris to Linux Porting »

The Solaris-Linux Porting Kit (SLPK) is a porting environment for enterprise businesses to automate Solaris to Linux migration – further reducing the TCO of a Linux solution.


Open Source Electric Diagrams Application »


QElectroTech is free software for creating electric diagrams.


Java: Open Source FTP Server »

The Apache FtpServer is a 100% pure Java FTP server. It’s designed to be a complete and portable FTP server engine solution based on currently available open protocols. FtpServer can be run standalone as a Windows service or Unix/Linux daemon, or embedded into a Java application. We also provide support for integration within Spring applications and provide our releases as OSGi bundles.


The default network support is based on Apache MINA, a high performance asynchronous IO library. Using MINA, FtpServer can scale to a large number of concurrent users.

It is also an FTP application platform. We have developed a Java API, called the Ftplet API, that lets you write Java code to process FTP event notifications. Apache FtpServer provides an implementation of an FTP server to support this API.


Papers for Developers Reading »

Links to some papers for reading
