I was debugging a seemingly random crash in some graph layout code. An hour later, feeling proud of myself for having fixed it, I went to file the problem in the bug tracker, only to find an existing bug report describing the exact problem and its resolution. Lesson learned: always, always check the bug tracker before trying to debug broken code yourself.
Archive for the 'Development' Category
Ever needed to find all k-combinations of a set? Of course! I’m pretty sure everybody has run into this problem one way or another, as part of development work, a combinatorics assignment (eek!) or in everyday life. For me, I needed to implement this for generating association rules. What better way to prototype my eventual Java implementation than to use Groovy?
def choose(def itemset, int choose) {
    def results = []

    // Initialize indices
    int[] indices = new int[choose]
    for (i in 0..<choose) {
        indices[i] = i
    }

    boolean hasMore = true
    while (hasMore) {
        def combo = []
        for (i in 0..<indices.size()) {
            combo << itemset[indices[i]]
        }
        results << combo

        hasMore = { /* Closure to move the right-most index */
            int rightMostIndex = { /* Closure to find the right-most index */
                for (i in choose-1..0) {
                    int bounds = itemset.size() - choose + i
                    if (indices[i] < bounds) return i
                }
                return -1
            }() // execute closure

            // increment all indices
            if (rightMostIndex >= 0) {
                indices[rightMostIndex]++
                for (i in rightMostIndex+1..<choose) {
                    indices[i] = indices[i-1] + 1
                }
                // there are still more combinations
                return true
            }
            // reached the end, no more combinations
            return false
        }() // execute closure
    }
    return results
}
I’ve based my implementation on one from Applied Combinatorics by Alan Tucker. First, there is an index array that stores k positions in the itemset. The items at these index locations form one k-combination. The algorithm increases the right-most index until it reaches the last element of the itemset, then increases the second right-most index (resetting every index to its right to follow directly after it), and so on.
I wouldn’t recommend this implementation when dealing with large itemsets. A deficiency with this one-method approach is that a single list is constructed containing all of the combinations. This list can grow to be very large, very fast. It can be easily adapted to provide one combination at a time by refactoring the hasMore check into a separate method. This way, it would act like an iterator. It’s too bad that Groovy doesn’t have support for the do-while loop as well, otherwise the hasMore closure could have been factored out into a really cool while check.
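As a rough sketch of the iterator-style adaptation described above, here is how it might look in Java. The class name CombinationIterator and its layout are my own assumptions for illustration, not the eventual implementation; it walks the index array in the same order as choose() but hands back one combination per call instead of building the whole list.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch: lazily enumerates the k-combinations of a list. */
final class CombinationIterator<T> implements Iterator<List<T>> {
    private final List<T> items;
    private final int k;
    private final int[] indices;   // current positions into items
    private boolean hasMore;

    CombinationIterator(List<T> items, int k) {
        if (k < 0) throw new IllegalArgumentException("k must be non-negative");
        this.items = items;
        this.k = k;
        this.indices = new int[k];
        for (int i = 0; i < k; i++) indices[i] = i;   // start at 0, 1, ..., k-1
        this.hasMore = k <= items.size();             // no combinations if k > n
    }

    @Override
    public boolean hasNext() {
        return hasMore;
    }

    @Override
    public List<T> next() {
        if (!hasMore) throw new NoSuchElementException();
        List<T> combo = new ArrayList<>(k);
        for (int index : indices) combo.add(items.get(index));
        advance();
        return combo;
    }

    /** Moves to the next index tuple, or clears hasMore when none is left. */
    private void advance() {
        // Find the right-most index that has not yet reached its upper bound.
        int i = k - 1;
        while (i >= 0 && indices[i] == items.size() - k + i) i--;
        if (i < 0) {
            hasMore = false;
            return;
        }
        // Increment it and reset every index to its right.
        indices[i]++;
        for (int j = i + 1; j < k; j++) indices[j] = indices[j - 1] + 1;
    }
}
```

Only the index array and the combination currently being returned are held in memory, so the caller can stop early or stream through a large itemset.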
Markers are a great feature of Eclipse and there are some great articles on creating Markers. However, I couldn’t find a good article on opening markers in an editor. So, here’s the best call sequence that I could figure out:
IJavaElement element = ...;
IEditorInput input = EditorUtility.getEditorInput(element);
// Compilation units open in the CU editor; class files need the class file editor.
IEditorPart editor = getSite().getPage().openEditor(input,
        (input instanceof FileEditorInput) ? JavaUI.ID_CU_EDITOR : JavaUI.ID_CF_EDITOR);
IDE.gotoMarker(editor, sNode.getMarker());
EditorUtility is an internal JDT class, but I couldn’t find a better way of doing this. A check on the type of the object returned by getEditorInput is necessary, since it can return either a file editor input (for compilation units) or a class file editor input.
I’ve taken some time implementing Peter Norvig’s spelling corrector in an attempt to learn Groovy, a dynamic language that compiles to bytecode and is compatible with standard Java classes and libraries.
There are a couple differences (most likely deficiencies) with my implementation. First, I use a list instead of a set when constructing the candidate word list. Second, I created a separate occurrence function in order to provide the smoothing capability for our occurrence distribution. Third, I didn’t really care much for a low line count. It’s not the LOC that matter in the end, it’s how easily you can comprehend the code!
public class SpellingCorrector {
    def wordoccur = [:]

    def words(File file) {
        Scanner scanner = new Scanner(file)
        def words = scanner.findAll{ x -> x.toLowerCase() ==~ ~/[a-z]+/ }
    }

    def train(List words) {
        words.each {
            wordoccur[it] = wordoccur.containsKey(it) ? wordoccur[it] + 1 : 1
        }
    }

    def edits1(String word) {
        def results = []
        int n = word.length()

        // Deletion. Remove a character.
        for (i in 0..<n)
            results << word[0..<i] + word[i+1..<n]

        // Transposition. Swap adjacent characters.
        for (i in 0..<n-1)
            results << word[0..<i] + word[i+1] + word[i] + word[i+2..<n]

        // Alteration. Change one character for another letter.
        for (i in 0..<n)
            for (c in 'a'..'z')
                results << word[0..<i] + c + word[i+1..<n]

        // Insertion. Add a letter between the others (or at either end).
        // Unlike alteration, the character at position i is kept.
        for (i in 0..n)
            for (c in 'a'..'z')
                results << word[0..<i] + c + word[i..<n]

        return results
    }

    def knownedits2(String word) {
        def candidates = []
        edits1(word).each {
            candidates.addAll( edits1(it).findAll { wordoccur.containsKey(it) } )
        }
        return candidates
    }

    /**
     * Smoothing distribution. If the word hasn't been encountered (novel words),
     * we give it an occurrence value of 1.
     */
    def int occurrence(String word) {
        return wordoccur[word] == null ? 1 : wordoccur[word];
    }

    def List known(List words) {
        return words.findAll { wordoccur.containsKey(it.toLowerCase()) }
    }

    def correct(String word) {
        def candidates = [word] + known([word]) + known(edits1(word)) + knownedits2(word)
        return candidates.max { occurrence(it) }
    }
}
The implementation has a few limitations. First, we don’t attempt to split words into two sub-words. For example, a common typo may be “Ihave” rather than “I have”. Second, the training and known functions can definitely be improved with support for proper nouns, stemming, and more. I think it would be a fun exercise to try to create a simple implementation of these features, much like the SpellingCorrector.
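To make the sub-word idea concrete, here is a minimal Java sketch of how a run-together token such as “Ihave” could be split. The WordSplitter class and its splits method are hypothetical names of my own, and the known set stands in for the keys of the trained wordoccur map; this is one possible approach, not part of the corrector above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: proposes two-word splits of a run-together token. */
final class WordSplitter {
    private WordSplitter() {}

    /** Returns every "left right" split where both halves are known words. */
    static List<String> splits(String word, Set<String> known) {
        List<String> results = new ArrayList<>();
        String lower = word.toLowerCase(Locale.ROOT);
        // Try every cut point that leaves at least one character on each side.
        for (int i = 1; i < lower.length(); i++) {
            String left = lower.substring(0, i);
            String right = lower.substring(i);
            if (known.contains(left) && known.contains(right)) {
                results.add(left + " " + right);
            }
        }
        return results;
    }
}
```

The resulting splits could be treated as extra candidates alongside the single-word edits, although ranking a two-word candidate against a one-word candidate would need its own scoring rule.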
So, Groovy has great support for regular expressions, list construction and compositions and best of all, closures! I also had a chance to play with the Groovy NodeBuilder (on a separate program), which is a great way for constructing tree structures. All said and done, I would hate to implement this in Java.
It seems that Adobe doesn’t want to be left out of the Web 2.0 office application fad with its Acrobat.com. It provides document writing, desktop sharing, PDF creation, and a neat online PDF reader. All of this was made possible by employing the formerly Macromedia Flash technology. I was initially excited about the online Acrobat reader since the Linux reader is quite slow, and the other online solutions, such as Scribd, are less than impressive. However, the Flash plug-in for Linux isn’t very impressive either. Well, in any case, Adobe seems to have gotten the right idea by starting work on an open-source Flash and certifying PDF as an ISO standard.
I’ve recently started to experiment with distributed source control systems for my personal repository. I had been using Subversion previously, but it had several issues with directories that bothered me. In addition, since my primary computer was a laptop, I also wanted to have full commit and change tracking when I was offline.
So distributed source control systems seemed to fit the bill. I looked at two systems in particular, Mercurial and git. Mercurial caught my eye because of its simplicity and its similarity to traditional, centralized SCMs such as CVS and Subversion. However, I actually started using git first. The reason was that many open source projects had switched to git and I needed to compile several bleeding edge packages. So, I had no choice but to learn to use git. However, I couldn’t really wrap my head around it. While git is no doubt a very powerful SCM, it is also a very complicated one. It took me a good hour or so before I understood how to track branches.
So, I settled for Mercurial. While I was worried that Mercurial was too immature, the fact that the Mozilla projects are also using Mercurial was very comforting.
A visual comparison between using the PHP rand() pseudo-random generator and the numbers generated by random.org, a truly random generator.
I’ve recently started using the beamer class to create slides for my presentation. Up till now, I’ve been using powerdot, and found it more than sufficient. I initially thought beamer to be far more complex than necessary. However, one feature convinced me to switch: PDFTeX and XeTeX support.
Both PDFTeX and XeTeX create a PDF directly from the LaTeX source. XeTeX is of particular interest since it adds support for TrueType and OpenType fonts. For beamer presentations, this is great, since it opens up a huge selection of fonts for use in presentations. To change the default font in the document with XeTeX, use the fontspec package. The xunicode package provides additional mapping between LaTeX accents and the selected font. A third package, xltxtra, provides some fixes relating to fonts.
\documentclass[xetex,mathserif,serif]{beamer}
\usepackage{fontspec}
\usepackage{xunicode} %Unicode extras!
\usepackage{xltxtra} %Fixes
\setmainfont{Calibri}
\setmonofont[Scale=0.86]{Andale Mono}
Of course, you should replace Calibri and Andale Mono with a font of your choice.
Another nice package to use with PDFTeX is microtype, which provides better font output. Enable the package with this line:
\usepackage[final,expansion=true,protrusion=true,spacing=true,kerning=true]{microtype}
XeTeX and PGF / TikZ
PGF / TikZ is a TeX library for drawing graphics using the PDFTeX and XeTeX drivers. However, you may encounter the following message when attempting to compile a presentation with PGF / TikZ pictures in your Beamer slides:
Package pgf Warning: Your graphic driver pgfsys-dvipdfm.def does not supported marking the current position.
Unfortunately, the TikZ library included in the TeXLive 2007 distribution does not support XeTeX. This breaks cross-picture coordinates, which are used to draw arrows between separate TikZ pictures in a Beamer frame.
While we wait for TeXLive 2008, you can install the new version of PGF from CTAN, which adds support for the XeTeX driver. Simply download the package, copy the files to your local ~/texmf/tex/ directory, and execute texhash to update the TeX listings.
XeTeX and Wide Pages
Although I haven’t had much time to investigate the issue, it seems that the pgfpages package that is used with beamer is not entirely compatible with XeTeX. In particular, the commands:
\usepackage{pgfpages}
\setbeameroption{show notes on second screen} %beamer
\pgfpagesuselayout{two screens with optional second} %pgfpages
are enough to have pdflatex generate notes to the right of the slide, but with xelatex they don’t have any effect. This post by Tomáš Janoušek to the XeTeX mailing list noted that the problem was due to a bug in the pgfpages package. Adding the following snippet fixes the problem:
\renewcommand\pgfsetupphysicalpagesizes{%
  \pdfpagewidth\pgfphysicalwidth\pdfpageheight\pgfphysicalheight%
}
I’ve been doing some development on the feature modeling plug-in during the past week and have implemented several new features and bug fixes (shown below).
I’m releasing the plug-in as a development release, for now. I have started rewriting the configuration backend, but my thesis deadline is fast approaching and I will not have enough time to complete the changes in fmp. In any case, please let me know of any bugs you find, or if you have a feature request. The source code is also included in the plug-in, so feel free to hack away at it yourself if you are inclined. When the plug-in is sufficiently tested, I will merge this branch into the trunk of the CVS repository on SourceForge. Give it a try!
New Features
- New, more robust and featureful constraint view.
- Constraints implied by the feature hierarchy are shown alongside the additional constraints.
- When a configuration is selected in the feature model editor, the constraints are evaluated and the status of each constraint is shown (i.e. satisfied or not satisfied).
- Support for arbitrary propositional formulas when writing additional constraints. NOTE: constraints are now written using node Ids instead of XPath expressions. Feature models created using fmp 0.6.6 are still compatible, but their constraints must be rewritten using the new grammar. See below for examples.
- Ability to view Node Ids next to feature names in the feature model.
- Constraint input validation.
- Constraint resolution. An unsatisfied constraint can be resolved in a configuration by right-clicking and selecting ‘Resolve Constraint’.
Installation
- Download ca.uwaterloo.gp.fmp_0.7.0.jar
- Compiled for Java 5 (Java 6 compatible), Eclipse 3.2
Project Homepage: http://gsd.uwaterloo.ca/projects/fmp-plugin/fmp-070/
Here’s a small snippet of code to load an Ecore resource without having to initialize every package needed to read all of its elements. This is useful if we’re interested in only a subset of the schema elements that are present in the Ecore model.
public static EList open(File file) throws IOException {
    ResourceSet resourceSet = new ResourceSetImpl();

    //Initialize the FSML Package information (ie. URI)
    MyPackageImpl.init();

    //Set OPTION_RECORD_UNKNOWN_FEATURE prior to calling getResource
    Resource.Factory.Registry.INSTANCE.getExtensionToFactoryMap().put(
        "*", new EcoreResourceFactoryImpl() {
            @Override
            public Resource createResource(URI uri) {
                XMIResourceImpl resource = (XMIResourceImpl) super.createResource(uri);
                resource.getDefaultLoadOptions().put(XMLResource.OPTION_RECORD_UNKNOWN_FEATURE, Boolean.TRUE);
                return resource;
            }
        });

    XMIResource resource = (XMIResource) resourceSet.getResource(
        URI.createFileURI(file.toString()), true);

    //Unknown elements will appear in this map
    System.out.println(resource.getEObjectToExtensionMap());

    resource.load(Collections.EMPTY_MAP);
    return resource.getContents();
}
Be aware that any unrecognized elements will be null in the retrieved Ecore model.
I ran into a problem with the statement-centric find(…) query when using an OntModel with the RDF vocabulary. The specified RDF resources were treated like NULL in the find function. To fix this, create a ResProperty instead of a ResResource, and all will be well. This can be done like so:
$rest = $this->ontModel->find($statement->getObject(), new ResProperty(RDF_NAMESPACE_URI . 'rest'), NULL);
References
My open source contribution, a bug report: ResResource as parameter for ResModel::find does not work.
It’s unfortunate that so much of the documentation on the Eclipse Modeling Framework (EMF) is scattered around the ‘net. After digging through the EMF newsgroups (which are immensely useful!) and several articles, I pieced together how XML serialization of an Ecore model can be customized using ExtendedMetaData EAnnotations.

However, when saving the model the XMLResource.OPTION_EXTENDED_META_DATA option must be set to true in order for the EAnnotations to be effective. To do this:
Map options = new HashMap();
options.put(XMLResource.OPTION_EXTENDED_META_DATA, Boolean.TRUE);
options.put(XMLResource.OPTION_XML_MAP, xmlMap);
Finally, save the model with the specified XML resource options:
resource.save(options); //Save with the options map
WSDL not being updated from your PHP SOAP app?
After struggling for three hours trying to figure out what was wrong with my web service, it turned out that the WSDL was being cached by the PHP SOAP extension. To disable WSDL caching, add the following lines to the php.ini configuration file:
[soap]
soap.wsdl_cache_enabled = "0"
You should also delete the cached WSDL (located in /tmp/ for me).
Welcome to the new woggie.net. I will be adding more content in the future, so stay tuned!
Update: I started transferring some of the Linux guides from my old site, but found that many of them were obsolete. It’s great to find that a lot of things just work now with most Linux distributions.









