Tuesday, January 31, 2012

Kakasi-java: born again

Japanese is written with several scripts: kanji, hiragana and katakana. In software, we sometimes need to convert text from one script to another, and that is difficult.

Kakasi and MeCab are Open Source libraries dedicated to the problem of converting kanji to hiragana or katakana. For instance they can transform "国際財務報告基準" to "こくさいざいむほうこくきじゅん" or even to "kokusaizaimuhoukokukijun". In other words, they transform logograms (symbols with multiple possible readings) into syllabic characters.
That is very tricky, because for instance "経緯" can be transformed to "keii", but also to "ikisatsu" depending on the context or speaker. Kakasi sometimes gets it wrong, but usually it is not that bad. MeCab is actually better at that.
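
To make the difficulty concrete, here is a minimal sketch of a dictionary-based converter using greedy longest-match lookup. It is only an illustration: the tiny dictionary and the function name are made up, and this is not Kakasi's or MeCab's actual algorithm or data.

```python
# Toy reading dictionary: hypothetical, tiny, and nothing like a real one.
READINGS = {
    "国際": "こくさい",
    "財務": "ざいむ",
    "報告": "ほうこく",
    "基準": "きじゅん",
    "経緯": "けいい",  # could also be "いきさつ"; a dictionary alone cannot tell
}

def to_hiragana(text, readings=READINGS):
    """Greedy longest-match conversion; unknown characters are kept as-is."""
    longest = max(len(word) for word in readings)
    result = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in readings:
                result.append(readings[chunk])
                i += size
                break
        else:
            # No dictionary entry starts here: copy the character unchanged.
            result.append(text[i])
            i += 1
    return "".join(result)

print(to_hiragana("国際財務報告基準"))
```

A lookup like this always returns the same reading for "経緯", which is exactly why context-aware tools can do better.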

Yesterday I decided to add a "furigana" feature to my Android flashcards app. Furigana are small kana printed next to kanji to help people read difficult ones. They are used a lot in mass media: books, newspapers, signs, advertisements.
Kakasi and MeCab are both conversion tools, but their internal algorithms are very different, leading to different speed/quality/simplicity characteristics. Before running to MeCab, I decided to also give Kakasi a try.

Unfortunately, Kakasi is written in C, and thus not easy to run on Android. Porting from C to Java would be possible, but before doing it I had to make sure nobody had ported it already. After multiple searches, I finally found a tar file on the blog of Kenichi Maehashi, with the comment "現在どこからも入手できないようです" ("it does not seem to be available anywhere at the moment"). In other words: Kakasi-java could no longer be found on the Internet, so he uploaded the 0.4 version he miraculously found in his backups.

To make improvements and fixes possible, I took the source, compiled, tested it, wrote a little README file and created a project for it on GitHub. Code contributions are welcome :-)

The best would be a Java port of MeCab, but that does not seem to exist. MeCab has a Java binding, but it is not 100% Java, requiring JNI calls, which is not a great idea for Android.
Nicolas Raoul

Monday, January 30, 2012

Templated nodes in Alfresco 4


In the new Alfresco 4, you can now easily use document templates.
Templates are convenient for forms that employees must fill in often, for instance.
Another example: here at Aegif we often write new contracts, always based on the same template.
In this article I explain how to create and use a template in Alfresco.
First log in to Alfresco Share as admin, and click the repository icon:
Go to the "Data dictionary" folder, then "Node templates" folder, and upload your template:


That's all!
To use it, click "Create content..." then "By Templated Node...", and you can select a template:


The template is then copied to the current folder, and you just have to rename it.

More technical details:

  • Can be used to easily create nodes with a custom content model.
  • Convenient for nodes with a particular set of aspects and properties.
  • Node templates must be created by an administrator (or someone who has access to the data dictionary), and are usable by everyone.
  • I haven't tried but I guess you can use permissions to show a particular template only to a particular group of users.
  • I will have to check whether associations are preserved or not.
  • As I reported on JIRA, folder hierarchies cannot be used as templated nodes yet.
Do you know any trick with templated nodes? Let us know in the comments :-)
Nicolas Raoul

Keywords: Créer un contenu... A partir d'un modèle. Inhalt erstellen... Nach Mustervorlagen-Node. Crear contenido... Por nodo de plantilla. Crea contenuto... Per nodo modello. 親近コンテンツ… テンプレートノード

Sunday, November 27, 2011

AnkiDroid 1.0 released!

Yesterday I released version 1.0 of AnkiDroid (flashcards app for Android).
With already 140,000 users, why is AnkiDroid only at version 1.0?
We released many 0.x versions, and now we feel the app is ready to be called 1.0!
The number of users has been multiplied by 8 in less than a year:


With 1.0, today is a good opportunity to describe how the AnkiDroid community of contributors works.

1) We do all we can to be friendly with all newcomers. We answer all questions and thank users for using the product.

2) We encourage everybody to participate, at every level. We introduce users to the bug tracker, and liberally give them rights to edit the Wiki, which makes them feel they are contributors rather than just users. Similarly, everyone is able to fork and edit the code without having to ask anyone for permission. Localization is the archetype of this spirit: anyone can translate strings, and they are included in AnkiDroid automatically. Like on Wikipedia, consensus is the rule.

Why do we need to keep involving more and more people? Well, contributors have day jobs or exams, so their individual participation includes periods of activity alternating with periods of inactivity. See for instance this timeline of contributions, one color per contributor:


The vast majority of contributors are volunteers, but some people are also paid to contribute. Notably, the first Simplified Chinese localization was sponsored by a Chinese company.

Geographically, developers have always come from a wide variety of countries: Egypt, Japan, Germany, Sweden, Spain, Brazil... Beta-testers come from virtually all over the globe, with even one in Antarctica. Thanks to this diversity, tricky issues with right-to-left languages or special characters are detected and fixed early. AnkiDroid is now available in 27 languages.

Feel free to comment if you have any questions!
Nicolas Raoul

Friday, November 18, 2011

Targeting China? Be prepared for additional web development costs

If you want your web application to tap into the huge Chinese market, I have bad news for you.

Until April 2011, IE6 was still the #1 browser in China. Now it is still at 37%.
That's bad news, because IE6 is an old browser with many bugs. Making your website usable on IE6 is very complex: it will typically double the development cost of your custom UI widgets.

If you are not targeting China, then do like Google Apps, Facebook, Twitter and YouTube: don't invest time in making your website compatible with IE6. Actually 87% of the global top-30 websites offer a sub-optimal experience to IE6 users.

But if you are targeting China, be prepared. In particular:
Chrome recently became the second most-used browser in China, but guess what was the second most-used web browser earlier this year? Not Firefox, not Safari. It was Maxthon, from Hong Kong. It is based on Trident, just like Internet Explorer. There is also 搜狗 (Sogou), which can use either the IE or the Chrome rendering engine. Finally, claiming 172 million active users, 360安全 is also based on Trident, and shows the same faults as IE6.
Nicolas Raoul

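New open source project: Troois

This is Nicolas.

Yesterday I released a new open source project called Troois. It is a clone of Amazon S3 that runs on Google App Engine.

Amazon S3 is very convenient for applications that write and read files. In particular:
- Applications running on multiple nodes in the cloud (in that case, the filesystem cannot be used).
- Applications running on Heroku (a cloud PaaS). Heroku's filesystem is read-only, and Heroku's database is an expensive resource.

However, Amazon S3 is not free. It is cheap, but a free alternative has advantages in the following cases:
- You want to reduce complexity, such as handling invoices.
- You want zero running costs. Example: non-profit projects.
- You want to avoid the risk of a huge bill. With a paid service, a programming error or a deliberate attack can result in a huge bill.

There was no free version of Amazon S3, so I made one. It runs on Google App Engine. I released the source code as open source under the Affero General Public License, so everyone is free to use it, and you can also modify the source code as long as you keep it open.

I named the project "Troois": I took the "trois" from "S trois", the French pronunciation of S3, and inserted the "oo" from Google. In Japanese, the pronunciation would be something like "トローワ"...

As with Amazon S3, files sent via REST HTTP POST can be downloaded via REST HTTP GET.

Please try it on your own Google App Engine instance and send feedback. And please, please help with new features!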
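
To illustrate the POST-then-GET principle, here is a minimal sketch of an in-memory file store, using only the Python standard library. It is not Troois' code and does not reproduce the Troois or S3 URL scheme: the handler, the path and the status codes are assumptions chosen for the example.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

FILES = {}  # path -> bytes; in-memory stand-in for real storage

class StoreHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Store the request body under the request path.
        length = int(self.headers.get("Content-Length", 0))
        FILES[self.path] = self.rfile.read(length)
        self.send_response(201)
        self.end_headers()

    def do_GET(self):
        # Return whatever was stored under this path, or 404.
        body = FILES.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

def round_trip(path, data):
    """Start a throwaway local server, POST data to path, then GET it back."""
    server = HTTPServer(("127.0.0.1", 0), StoreHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d%s" % (server.server_address[1], path)
    # Ignore any proxy configured in the environment: the server is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        opener.open(urllib.request.Request(url, data=data, method="POST")).close()
        with opener.open(url) as response:
            return response.read()
    finally:
        server.shutdown()
        server.server_close()
```

Calling round_trip("/bucket/hello.txt", b"hello") should return the same bytes that were uploaded.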

@nicolas_raoul

Friday, October 28, 2011

Where to run your background jobs for free?

Until now, Google App Engine was widely seen as the perfect solution to running jobs on the cloud for free.
It is indeed powerful and convenient, offering for instance cron and task queue APIs.

I have been using GAE a lot for not-for-profit projects I am involved in, and was enchanted. The dream will end in 3 days, as this email from Google told me:
"As part of Google's long-term commitment to App Engine, we are also updating our policies, pricing and support model to reflect its status as a fully supported Google product"
In 3 days, GAE will get more expensive. Many applications will switch from the free zone to the paying zone, and they have 4 options:
- Pay
- Make the jobs faster, by writing smarter code or reducing the scope.
- Leave the jobs unprocessed once the quota is reached, which might be acceptable for some apps.
- Switch to an alternative service.

Under the new pricing, GAE offers 9 hours of backend instance time, but most jobs will run into another limit much sooner: only 50,000 database writes are allowed in the free zone. So, is it time to switch to the Celadon Cedar stack to benefit from Heroku's new pricing?

PaaS offer          CPU time     Database operations
Google App Engine   9 hours      50,000
Heroku              720 hours    Unlimited

The CPU time is not directly comparable, but that's still quite a difference. So, where's the catch? Well, on Heroku you either have a free frontend OR a free backend. If you want one worker dyno for free, you must use zero web dynos. The consequence is that implementing any kind of web interface to control your delayed_jobs is a real challenge.
@nicolas_raoul

Tuesday, October 11, 2011

AnkiDroid presentation in Roppongi Hills

On Thursday I will give a presentation about AnkiDroid, in Japanese!

Place: Roppongi Hills, Mori Tower 2F, Hills Space
Time: 19:00~22:00, 2011 October 13th
Admission: 1000 JPY with one drink.

Other people will present various other creative projects, should be a lot of fun!
In the morning of the same day, I will be at the eDocumentJapan conference as a Japanese/English interpreter for IT pioneer John Newton.

Monday, October 3, 2011

ECM presentations at eDocument Japan

We are giving two presentations at the eDocument Japan conference:
- The place of Open Source in the global ECM market
- Social Content Management with Open Source software Alfresco

The presentations will be in English, translated to Japanese.
Tokyo Big Sight, October 13th, 10:00 AM and 12:20 PM.

Alfresco's CTO in person is coming to Japan for the occasion.
Organized by JIIMA (Japan Image and Information Management Association)

Wednesday, August 24, 2011

Giving a presentation about Alfresco in Shinjuku

On Thursday I will give a presentation about Alfresco in Shinjuku, Tokyo.

I still have to think about the details, but I will probably be presenting the basics of Alfresco and how to set up content management rules.

I will be speaking in Japanese.

Place: 東京都新宿区百人町2-27-6 関東ITソフトウェア健保会館
Time: 25th of August, 2011 (Thu) 19:00
Price: Free registration here

Tuesday, July 19, 2011

AnkiDroid one of "The 100 best Android apps"!

AnkiDroid has been growing a lot recently: the number of users has quadrupled in the last 6 months!

With now 75,000 installs, AnkiDroid has just been selected by makeuseof.com as one of The 100 Best Android Apps!
Congratulations to all of the team!

I will soon release AnkiDroid 0.8 with a lot of bug fixes, and a version with a totally re-engineered database and a much more efficient SRS algorithm should be out before the end of the year.

Tuesday, March 22, 2011

Alfresco accreditation


I just received the certificate for the Alfresco accreditation I passed a few weeks ago.

Three of my colleagues also managed to get it, so the whole company has just been declared an Alfresco Recognized Partner, allowing us to use the shiny green badge!

Tuesday, February 15, 2011

Applying Business Intelligence to Bug Tracking

Last week I released AnkiDroid 0.5.1, and judging by the Android Market's comments, people seem to love it :-)

For a few releases now, an opt-in feedback mechanism has been sending us a report every time a problem happens. The anonymous reports are automatically scanned by a Google App Engine application to determine whether each one is a new bug or just an additional occurrence of an already known bug. The data can be exploited in two ways:

1) An online application allows one to browse the reports and bugs, see which bugs happen the most often for a given version, and associate them with issues in the bug tracker.

2) A set of Business Intelligence tools allows one to drill down into reports in a multidimensional OLAP cube, and generate reports to show any interesting findings. As a quick example, here is the distribution of crashes among Android versions. Those tools use the open source Pentaho Business Intelligence suite.
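
As a rough illustration of both ideas, here is a minimal sketch that groups crash reports into distinct bugs and counts crashes per Android version. The matching rule (hashing the stack trace with line numbers removed), the field names and the sample reports are all assumptions made for the example; the real application's logic is not described here.

```python
import hashlib
import re
from collections import Counter

def signature(stack_trace):
    """Hash a stack trace after stripping line numbers, so that the same
    crash reported from slightly different builds maps to the same bug."""
    normalized = re.sub(r":\d+\)", ")", stack_trace)
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()[:12]

def group_reports(reports):
    """Return {signature: [reports]}; each key stands for one distinct bug."""
    bugs = {}
    for report in reports:
        bugs.setdefault(signature(report["trace"]), []).append(report)
    return bugs

def crashes_by_android_version(reports):
    """One-dimensional slice of the data: crash count per Android version."""
    return Counter(report["android"] for report in reports)

# Made-up reports, for illustration only.
reports = [
    {"android": "2.2", "trace": "NullPointerException at Deck.open(Deck.java:120)"},
    {"android": "2.1", "trace": "NullPointerException at Deck.open(Deck.java:124)"},
    {"android": "2.2", "trace": "SQLiteException at Deck.save(Deck.java:301)"},
]
bugs = group_reports(reports)
print(len(bugs), "distinct bugs")
print(crashes_by_android_version(reports))
```

A new report whose signature is not yet a key of the grouping would be treated as a new bug; anything else is one more occurrence of a known one.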

Friday, February 4, 2011

Alfresco: Categories vs. Spaces

Managing huge amounts of documents requires knowing the limits of the ECM software you are using. Here is a study I performed about the limits and best strategies for Alfresco.

Categories vs. Spaces

In Alfresco, documents are usually organized hierarchically in spaces (a kind of folder). But how about using Alfresco's "categories" feature instead of spaces?

Note: For all graphs in this article, horizontal axis = number of documents, vertical axis = time taken in milliseconds

This graph shows the time taken to show a space, based on the number of documents this space contains, and the same for a category. There are no sub-categories or sub-spaces involved.
For the same number of documents, categories show faster than spaces. That is especially true above 100 documents. In a space, time is proportional to the number of documents. In a category, time grows more logarithmically.
For huge numbers of documents, categories show in less than 3 seconds, whereas spaces take a very long time only to show Java errors related to a shortage of memory.

Impact of the spaces hierarchy on performance

To measure the impact of hierarchy, a comparison was done between two strategies for organizing files in spaces:
(1) All files in about five spaces.
(2) Each file contained in its own 3 levels of sub-spaces (subspace1/subspace2/subspace3/file).

This graph shows the time taken by Alfresco's explorer to show a category, based on the number of documents that are shown.
Surprisingly, having the files scattered across a lot of different folders is more efficient.
Alfresco seems to have difficulties handling many files in the same space.
This has to be taken into account when analyzing the category performance tests, because they use strategy (1), the slowest.

Impact of the number of categories applied

This graph shows the time taken by Alfresco's explorer to show a category, based on the number of documents in this category.
Two tests have been done, with different numbers of categories.
As one would expect, if each document has 3 categories, it is faster than if each document has 10 categories.

Impact of the size of the repository on performance

This graph shows the time taken by Alfresco's explorer to show a category, based on the number of categorized documents in the repository.
Each document has 3 categories randomly selected from a pool of 20 existing categories.
The different curves show different usages of the explorer:
- Navigation in the categories tree view, with subcategories inclusion checked/unchecked.
- With or without 10000 additional uncategorized documents.
- First click or after three clicks (to measure cache performance)
- Search in a root category (Software Document Classification) including subcategories
- Search in a leaf category (Configuration Description)

Performance seems to be proportional to the number of documents in the repository at first, and then becomes more stable after 6000 documents.
Cached requests don't take more than 3 seconds, even with a repository of 160000 documents, which means a result set of 24000 documents.
On the contrary, search time grows consistently with the size of the repository.
The light blue curve's values are surprising and might be an artifact of the state of the database during the measurement. Usual values are expected to be closer to the brown curve.
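
The result set size quoted above follows from the test setup: each document carries 3 of the 20 categories, so on average a category holds 3/20 of the repository. A quick check, assuming the categories are drawn uniformly:

```python
def expected_category_size(documents, categories_per_document, pool_size):
    """Average number of documents per category when each document gets
    `categories_per_document` categories drawn uniformly from the pool."""
    return documents * categories_per_document / pool_size

print(expected_category_size(160000, 3, 20))  # 24000.0
```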

Method

Using Google Chromium and its Speed Tracer extension, I measured the time between the DOM click and the end of the processing (excluding repaints that occur after the page is shown completely).

Conditions:
- Alfresco Enterprise 3.2 with heap.maxsizesize = 500MB
- Ubuntu Karmic 2009.10 with Sun Java HotSpot 1.6.0_15
- Laptop with Intel Core Duo T9600 2.8GHz and 4GB RAM

Notes:
Empty categories are shown in about 600 milliseconds. But once, with 10000 categorized documents plus 10000 uncategorized documents, a particular empty category took 6 seconds to load, consistently.
Even for a well-defined operation, performance is not very predictable.

Conclusion

Categories show faster than spaces in the Alfresco explorer, especially when they contain large numbers of documents.
On huge repositories, performance is slow at first but gets better once requests have been cached: most pages then take less than 3 seconds to load.

Depending on the requirements, the performance of categories might be deemed acceptable: there is no bottleneck or operation that takes more than 10 seconds.

However, some features are not available to someone who would use Alfresco's “Categories” tree view exclusively, for instance:
- No permissions settings based on categories.
- No content rules settings based on categories.
- No "Add content" button when browsing categories.

Monday, January 24, 2011

Just passed the Alfresco accreditation

Back in 2009, I was in Milan designing the future Alfresco accreditation tests. Those tests will be offered to anyone, starting from summer 2011.

Because my company Aegif is an Alfresco partner, we have just been subjected to the test! Some of the questions have been written by me (what? unfair?), but all-in-all it was a bit more difficult than I expected. There are 137 questions (some multiple choice, some multiple response) to answer in 60 minutes. Some questions are very specific (which file does what) and some more general (which feature is not available).

Now I can add "Alfresco Recognized Developer" to my titles ;-)
More importantly, my company becomes an "Alfresco Recognized Partner".

Wednesday, January 12, 2011

Automatically deploy Alfresco WCM content to an FTP server

In Alfresco WCM, deploying means generating "baked" web content from XML content and templates, into a local FSR directory. Here is how to take this further and also deploy the content to your web server via FTP automatically.
  1. Make sure you have both Alfresco WCM and an FSR (now called File System Deployment Target) installed and working.
  2. Edit the Deployment Server's deployment/default-target.xml file and add a "postCommit" section linking to a postcommit script (example).
  3. Create the postcommit script, calling the lftp tool in mirror mode (example).
Unfortunately, the default Alfresco Deployment server does not report anything about the script's activity and potential errors. To see or log messages, please download Alfresco's source code, modify ProgramRunnable.java like this, recompile, overwrite alfresco-deployment-3.3.2.jar with the one you just generated, and then restart the server.
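
For reference, the core of the postcommit script from step 3 is a single lftp call in reverse mirror mode. Here is a minimal sketch that builds that command line; the directories, host and credentials are placeholders, and the real script may well be a plain shell script rather than Python.

```python
def lftp_mirror_command(local_dir, remote_dir, host, user, password):
    """Build an lftp invocation that uploads local_dir to remote_dir.
    `mirror --reverse` copies local to remote (plain `mirror` downloads).
    Paths containing spaces would need extra quoting, not handled here."""
    script = "mirror --reverse %s %s; quit" % (local_dir, remote_dir)
    return ["lftp", "-u", "%s,%s" % (user, password), "-e", script, "ftp://" + host]

# Placeholder values: adapt them to your FSR directory and web server.
command = lftp_mirror_command("/opt/fsr/www", "/public_html",
                              "ftp.example.com", "deploy", "secret")
print(command)
# To actually run it: import subprocess; subprocess.run(command, check=True)
```

Building the argument list separately from running it makes it easy to log the exact command, which helps given how little the Deployment server reports.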

Saturday, December 4, 2010

Video of my AnkiDroid presentation at Roppongi Hills

Here is a video of my recent speech in Roppongi Hills about AnkiDroid. About 200 people attended, including a Japanese company that is considering reusing the code as a part of their educational offering. I licensed the slides under a Creative-Commons-Share-Alike license, so feel free to reuse or modify them and speak about AnkiDroid at other events!

Tuesday, November 16, 2010

Aggregation for Pentaho/InfoBright

Infobright is a column-oriented database, so it is efficient for Business Intelligence (InfiniDB is worth checking too). But my current BI project has 24 dimensions so I also needed aggregates to reach a good level of performance.

Infobright is more or less supported by most of the Pentaho suite, but certainly not by the Aggregation Designer. I must be one of the first people to have tried, and I found many bugs and workarounds. It was a tough ride, but I finally found a way to make it work, so here it is!

1) Open PAD and design your aggregate the usual way. Ignore the primary key errors.
2) When you're done, in "Export and Publish", execute the DDL.
3) Try executing the DML, but chances are it will fail. This is probably because Infobright does not support INSERT well.
4) So instead, click on "Preview" and copy-paste the DML SQL code.
5) Open PDI and create a "Table Input" step with the copied SQL's select-from-groupby portion.
6) Connect it to an Infobright Loader step that will write the data into the table created by the DDL.
7) Run the transformation. You can use cron and pan.sh to run it automatically every night.
8) Back to PAD's "Export and Publish", you would normally publish your updated schema, but it results in a NullPointerException.
9) So, export your updated schema to a file instead.
10) Open it with Schema Workbench, ignore the primary key errors, and publish from there.
11) That's all!
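
For step 5, the part to paste into the "Table Input" step is everything from the SELECT keyword onward. Here is a minimal sketch of that extraction; the DML below is a made-up example, not actual Aggregation Designer output.

```python
import re

def select_portion(dml):
    """Return the DML from its first SELECT keyword onward, dropping the
    leading INSERT INTO ... (...) part and any trailing semicolon."""
    match = re.search(r"\bSELECT\b", dml, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("no SELECT found in DML")
    return dml[match.start():].strip().rstrip(";").strip()

# Hypothetical DML, shaped like an aggregate-population statement.
dml = """INSERT INTO agg_sales (year, store_id, total, fact_count)
SELECT t.year, f.store_id, SUM(f.amount), COUNT(*)
FROM sales f, time_dim t
WHERE f.time_id = t.id
GROUP BY t.year, f.store_id;"""
print(select_portion(dml))
```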

Tested with: pad-ce-1.2.1.RC1 biserver-ce-3.7.0.RC1 pdi-ce-4.1.0-RC1 infobright-3.4.2-0

Friday, November 5, 2010

Automatic report generation now possible in Pentaho Data Integration

Last Friday, Pentaho Data Integration (PDI) developer Matt Casters posted a preview of a new tool that allows Business Intelligence designers to include report generation (PRD) as a step of PDI. This is extremely useful, because the obvious step after ETL is often to generate reports.

Let's say a retail chain wants to send every shop manager a daily report detailing that shop's performance and trends.
Imagine you have a data warehouse that contains all sales records, and a PRD report template. Then here is how to create a system that will automatically generate and send the reports every day:
  1. Install the bleeding-edge PDI 4.1.0 RC1
  2. Add Matt's plugin as explained here
  3. Open PDI "Spoon" and create a new Transformation
  4. First, create a "Table Input" step to get each shop's code, name and email address.
  5. Second, create a minimal JavaScript step to compute the output's filename and set the PRPT file's name.
  6. Third, use the new "Pentaho Reporting step", and configure it to use your freshly computed PRPT and output filenames, as well as the shop codes.
  7. Finally, create a "Send mail" step and set it to use each shop's email address, putting the generated report as an attachment.
  8. Save and configure your cron to launch this transformation every night via the pan.sh command-line tool.
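
The data flow of steps 4 to 7 can be sketched outside of PDI as follows. This is only an illustration of the logic (one row per shop, one computed filename, one mail with an attachment); the field names, the filename pattern and the addresses are assumptions, and the real work is done by the PDI steps configured in Spoon.

```python
from email.message import EmailMessage

def output_filename(shop_code, day):
    """File name for one shop's report, for instance report_S001_2010-11-05.pdf."""
    return "report_%s_%s.pdf" % (shop_code, day)

def build_mail(shop, day, pdf_bytes, sender="reports@example.com"):
    """Build (but do not send) the message for one shop manager."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = shop["email"]
    message["Subject"] = "Daily report for %s (%s)" % (shop["name"], day)
    message.set_content("Please find today's report attached.")
    message.add_attachment(pdf_bytes, maintype="application", subtype="pdf",
                           filename=output_filename(shop["code"], day))
    return message

# Rows as the "Table Input" step might return them (made-up values).
shops = [
    {"code": "S001", "name": "Shibuya", "email": "shibuya@example.com"},
    {"code": "S002", "name": "Shinjuku", "email": "shinjuku@example.com"},
]
# The report content is a placeholder; in PDI it comes from the reporting step.
mails = [build_mail(shop, "2010-11-05", b"%PDF- placeholder") for shop in shops]
for mail in mails:
    print(mail["To"], [part.get_filename() for part in mail.iter_attachments()])
```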

Tuesday, October 12, 2010

Presentation about AnkiDroid in Roppongi Hills

I will give a presentation at Hills Breakfast on Friday 10/22 at 7:45 AM, in Mori Tower, just to the left of the Goldman Sachs entrance.
I will be speaking about AnkiDroid and Open Source in general.

Last month's presenters were the Director of the Mori Art Museum and two Google managers.
My friend Yuko Mizutani of Mori Building Co. is organizing the event and invited me to give a talk. I first thought I would talk about ECM, but she figured it would sound too much like an advertisement for my company, so I will be advertising Open Source instead!

Tuesday, August 17, 2010

Released AnkiDroid 0.4.1

I just released AnkiDroid 0.4.1, the result of a collaboration between about 8 developers and tens of other contributors!

AnkiDroid will always be open source, but it is starting to get attention from companies as well: a Chinese company contributed the Simplified Chinese user interface. This version brings in a lot of new features:

First, the user interface has been translated by volunteers into 13 languages:
- Portuguese
- Swedish
- Romanian
- French
- Spanish
- Italian
- Simplified Chinese
- Traditional Chinese
- Catalan
- Russian
- Polish
- Greek
- German

Second, you can now download decks that have been made by other people. Hundreds of good-quality decks are available in a multitude of topics: sciences, law, languages, medicine, etc.

And lots of other improvements!
In Google's Android Market, search for "Anki".

Monday, July 12, 2010

Business Intelligence with Pentaho

Today I started a new BI (Business Intelligence) project, my first in a long time. It is my first time using Pentaho, an open source BI suite. Pentaho's website is a bit confusing about the various programs and their goals, so I lost some time trying to figure them out before I found this nice guide. It explains how the BI server and the design tools work together, and how to use them. Also, Prashant Raju has a great guide that shows how to replace Pentaho's default HSQL database with a more powerful database such as MySQL or PostgreSQL. Infobright looks even more promising.

In other news, I just had a meeting with Liferay's CEO Bryan Cheung to talk about our strategy to make Liferay a success in Japan, and about Thursday's seminar, to which you are all welcome!

Friday, July 2, 2010

Organizing a Liferay seminar

We are organizing a seminar about Liferay, the open source Enterprise portal software described as visionary by Gartner. Liferay Portal powers large companies' public-facing websites with advanced content management features. Behind the scenes, Liferay offers companies a "social office" where employees can collaborate efficiently.

Liferay's CEO Bryan Cheung will introduce Liferay and present the roadmap, in particular for the Japanese market.
Aegif recently became the first Liferay Service Partner in Japan.

The seminar will take place at Roppongi Hills, 49th floor, on July 15th 2pm. Free entrance. More info here.

Tuesday, June 22, 2010

Just released the first integration-oriented CMIS explorer webapp

In a lot of ECM projects, employees who are not back-office-power-users are offered a simple web-based interface to browse a portion of the documents.
Sometimes, clients are offered another web-based interface to check their bills, read their contracts or other papers.

Just released as open source by Aegif (Japan), Struts2CmisExplorer is a new way to build those kinds of web interfaces. It is a CMIS explorer application that focuses on simplicity. This means you can very easily get it running, and modify it to fit your extra requirements, or integrate it into your existing portal.

It should work with all CMIS servers (Documentum, Nuxeo, Open Text, FileNet, etc.). For instance, here is how to get it up and running on Alfresco:

1) Install Alfresco 3.3 (Community or Enterprise). Check that it is running well at http://localhost:8080/alfresco

2) Download Struts2CmisExplorer and put it in Alfresco's tomcat/webapps directory.

3) That's all! Use it at http://localhost:8080/Struts2CmisExplorer_0.1

Struts2CmisExplorer is not intended to be a full-featured explorer. Instead, it targets the usual need for simple web-based access to documents. The goal is reusability and ease of integration. Struts2CmisExplorer does not rely on a particular framework, and dependencies are kept to a minimum (Struts2 and OpenCMIS), which means you can easily integrate it in any IoC framework you might want.

More on Struts2CmisExplorer.

Wednesday, June 9, 2010

CMIS has been approved

CMIS has recently been accepted as a standard, so it is time to get some experience with it! Actually my company just delivered its first CMIS-based solution yesterday.

CMIS means "Content Management Interoperability Services". It is a protocol to access ECM (Enterprise Content Management) repositories. Why did I choose to use CMIS instead of JCR or WebDAV? Because CMIS better targets the needs of ECM projects: it is actually the first protocol designed for Document Management, with interoperability being one of the main goals. CMIS is a joint effort between IBM, Microsoft, Alfresco, EMC, Open Text, SAP, Oracle, Adobe, Nuxeo and others.

Some client implementations already exist in beta versions, but their documentation is still very scarce. I chose OpenCMIS as a client library. Other solutions could have been chemistry-abdera, which is not as active, or using CMIS as Web Service or REST directly, which would have taken a lot of time.

As with other CMIS implementations, OpenCMIS documentation is still scarce (I actually contributed a good portion of it). So I will explain how to get started with CMIS in the small OpenCMIS tutorial below; I hope it helps someone. As an example, I will explain how to access Alfresco from an OpenCMIS client.

Using OpenCMIS to access an Alfresco repository

OpenCMIS has not been released yet, so you will have to compile it from source. Install subversion and maven if you don't already have them. Then get the source:
svn checkout https://svn.apache.org/repos/asf/incubator/chemistry/opencmis/trunk

... and compile it (in the root directory):
mvn clean install

This will create a bunch of JAR libraries, in particular those we need:
  • chemistry-opencmis-client-api-0.1-incubating-SNAPSHOT.jar
  • chemistry-opencmis-client-impl-0.1-incubating-SNAPSHOT.jar
  • chemistry-opencmis-client-bindings-0.1-incubating-SNAPSHOT.jar
  • chemistry-opencmis-commons-api-0.1-incubating-SNAPSHOT.jar
  • chemistry-opencmis-commons-impl-0.1-incubating-SNAPSHOT.jar
If you don't have an Alfresco server ready, install Alfresco Community (version 3.3 or later) and start it.

Download my small Java demo file and edit the URL, user, password to match your environment.

Then compile with the libraries, run, and the content of your repository should appear :-)
That's it! To start doing actually useful things, check the OpenCMIS cookbook.

Saturday, March 13, 2010

My contribution to IEEE's Pervasive Computing


QR codes are big in Japan. Actually, last year I spotted one in Shibuya that is wider than most flats in Tokyo. I took a picture.

Incidentally, the IEEE is publishing an article on QR codes in this month's Pervasive Computing magazine. If you buy it, you will see "Courtesy of Nicolas Raoul" under the first picture of the article :-)

"Pervasive computing" is a synonym of ubiquitous computing, the science of ultra-close human-machine interaction. As the article points out, QR codes are not widely used outside of Asia. But I think they will eventually become popular in the whole world. Recent Android phones are natively able to read QR codes!

Monday, December 7, 2009

Helpdesk software solutions

If you are selling software products in Japan, you might need a helpdesk tool to keep track of your clients' questions and problems. I performed an evaluation of 10 helpdesk software solutions and here are the results. I first selected 30 tools (including bug trackers), and restricted my choice to ten tools that are open source, actively maintained, and localized in Japanese.

In the QSOS spirit, I defined criteria and filled the matrix. I gave a weight to each criterion according to what is important to me, but using this OpenOffice spreadsheet you can input your own weightings according to what is important for your company.
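
The weighting itself is a simple weighted average. Here is a minimal sketch, with hypothetical criteria, weights and scores that are not the ones from the spreadsheet (QSOS scores each criterion from 0 to 2):

```python
def weighted_score(scores, weights):
    """Weighted average of criterion scores (QSOS scores go from 0 to 2)."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weight for name, weight in weights.items()) / total_weight

# Hypothetical criteria, weights and scores, for illustration only.
weights = {"japanese_localization": 3, "activity": 2, "email_integration": 1}
tools = {
    "Tool A": {"japanese_localization": 2, "activity": 1, "email_integration": 2},
    "Tool B": {"japanese_localization": 1, "activity": 2, "email_integration": 2},
}
ranking = sorted(tools, key=lambda name: weighted_score(tools[name], weights), reverse=True)
for name in ranking:
    print(name, round(weighted_score(tools[name], weights), 2))
```

Changing the weights can change the ranking, which is why it is worth entering your own.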

My winners are SiT! and GLPI.
But OTRS would rank higher if it were officially localized in Japanese.

Saturday, October 24, 2009

Drupal 7 semantic by default, when will Alfresco follow?

My university friend Stéphane Corlosquet has spent the last few months adding a very exciting feature to Drupal for its imminent next release: Drupal 7 will expose your website's structural information as RDFa, by default!

This has huge implications. Drupal being one of the most popular CMS, it handles a significant proportion of the Web's information. So Drupal 7 will effectively make the Semantic Web much bigger. Furthermore, a website's manager will now be able to define the website's ontology based on existing ontologies, which means each Drupal website will now be both a consumer and a producer of semantic information.

For website owners, an immediate result is that their website will be better understood by Google. But the Semantic Web is much more than an SEO trick. It is a way to make information more useful, more exact, and to make it understandable by computers.

So how long before Alfresco follows? The open-source Enterprise Content Management System should make sure the content it manages is understandable by both humans and machines. A company's way to structure information should not be kept in the Data Dictionary; it should be exposed as RDF Schema. The metadata, including aspects, should be accessible via RDF in addition to the usual REST/Java/JavaScript/JCR/CMIS APIs.

Tech videos

Some recordings of my Semantic Web-related presentations are available on video sharing websites. Here are my presentations at the Tokyo Linux Users Group and at the Yokohama Linux Users Group.

Thursday, September 10, 2009

Released AnkiDroid 0.2

I just released AnkiDroid 0.2!
You can install it on your Android phone using the Google Market. AnkiDroid is memorization software that already has 900 registered users. Of course, it is open source.
Many new features in this version 0.2:
- Basic spaced repetition.
- Preferences dialog to enable various things.
- Sample deck.
- Starts much faster.

Wednesday, August 5, 2009

Authorized trainer of Alfresco WCM

I am now an authorized trainer of Alfresco WCM! I am officially authorized to teach developers how to implement solutions using the Alfresco Web Content Management system. Here is the detailed review:
Nicolas was assigned a difficult topic to deliver, but delivered it well and confidently. He clearly has a good grasp of the material and is very knowledgable on WCM. He provided a good overview and delivered information without reading the slides.

Nicolas is comfortable presenting in English and gave full and detailed descriptions of some complex AVM concepts. Nicolas answered questions competently and should be able to train the complete course well and enthusiastically.

Reviewed by Ben Hagan and Carlos Miguens (Alfresco Software, Inc.)

Monday, July 27, 2009

First day at Aegif's Roppongi office

Last week's Alfresco gathering in Milan was great! After defining the future Alfresco certifications, we wrote hundreds of questions. I mostly wrote API questions, so people who take the Alfresco API Developer certification will probably get a few of my questions :-) All of the answers are in the documentation, of course. We were also trained to become Alfresco WCM trainers.

I came back from Italy yesterday, and am now working on the 28th floor of the Mori Tower in Roppongi, Tokyo. I am working in Japanese. This afternoon, I replied to support requests in the Japanese-language Alfresco forum, and updated the Alfresco Wiki after investigating export-import in Alfresco Enterprise.

Tuesday, July 21, 2009

Working in Milan

This week I am working in Milan in the offices of Sourcesense. We are about twenty people from Brazil, Australia, Europe, India, the U.S. and South Africa. Our goal is to brainstorm and define what Alfresco certifications will look like: which certifications will be offered, and what their content, price, modalities, difficulty, and many other things will be. We reached a consensus on most things, and tomorrow we will produce sample questions for each of the certifications.

In other news, I released OxygenGuide 0.3, which fixes bug #6, and AndroidBigImage 0.1.

Tuesday, July 14, 2009

Got hired to work on Alfresco

I got hired by Aegif (イージフ) to work on the open source ECM Alfresco! I would like to thank my former Japanese employers W3C Tokyo and Expresso for the great projects and wonderful time I had with them!

For my first week, my company is sending me to Milan, Italy for an Alfresco "Train the Trainer" course.

Saturday, July 11, 2009

The Guardian takes me as a reference!

In an article about the Google Chrome OS, British newspaper The Guardian cites a Wikipedia article I created two years ago. That's funny! The extract they cite has actually not changed much since I wrote it. The article is about Splashtop, an instant-on Linux distribution. I wrote it because Splashtop seemed like a very promising technology to me, even though it had not been released at the time. Splashtop now ships with Lenovo, LG and Sony products, and with most Asus motherboards.

My new open source project

I just released AndroidBigImage, an open-source library for Android.

Are you developing an Android application in which the user reads a static map, a comics page, a cheatsheet, a book page, or any other kind of big image? Such an image does not fit on most devices' screens, and that's where AndroidBigImage comes in: integrate a few Java files into your application to let your users display, zoom and scroll big images!

Sunday, June 28, 2009

Just published AnkiDroid on the Market

Anki already helped me memorize 7000 Japanese words, so having this software on my Android phone would be pretty helpful in the subway, I thought.

So today I published AnkiDroid on the Market, as the result of a combined effort with Damien Elmes, Andrew Dubya and Casey Link. To install it on your Android phone, just click on "Market", search for "anki", and click "Install". So far, 57 people have installed it in two hours. Here is the source code.

Update:
Now listed in Market's "Popular applications"!

Developing on an Android device using Ubuntu 9.04

Google's documentation is sometimes out of date, and on this topic it was clearly erroneous, so for anyone interested, here is how I managed to bridge an HTC Magic and Ubuntu 9.04 Jaunty to allow debugging over USB:

On the phone, in Settings/Applications/Development, check the box "USB debugging".

On the computer, install the Android SDK, change to the "tools" directory, then log in as root and create the file /etc/udev/rules.d/50-android.rules with this content:
SUBSYSTEM=="usb", SYSFS{idVendor}=="0bb4", MODE="0666"
and file /etc/udev/rules.d/90-android.rules with this content:
SUBSYSTEM=="usb", ATTR{idVendor}=="0bb4", MODE="0666"

Then type this:
sudo chmod a+rx /etc/udev/rules.d/50-android.rules
sudo chmod a+rx /etc/udev/rules.d/90-android.rules
sudo /etc/init.d/udev restart
./adb kill-server

You should get something like this:
./adb devices
List of devices attached
HT963LF01297 device

Monday, June 22, 2009

My Semantic Web slides translated to Japanese

I just discovered that my Semantic Web slides have been translated to Japanese! You can find the translated content here. Many thanks to AT-Corp for this!

Sunday, June 21, 2009

Created OxygenGuide, an open-source offline travel guide

A lot of people were asking for it, so this weekend I developed OxygenGuide and released it under an open-source license.

Imagine you are traveling around the world and suddenly find yourself in Riyadh looking for a restaurant or a place to sleep. Carrying travel books is a pain, and browsing the Internet on your mobile phone abroad will probably cost a lot.

That's where OxygenGuide comes in: it is a compact offline travel guide that takes only 150MB of your smartphone's storage space. The data is based on Wikitravel, but customized for small devices. With this world travel book on your mobile device, you will travel lighter and further. The current version works for notebooks such as the Eee PC. I will try my best to release an Android version during the next weekend.

Update: Now usable on Android too.

Thursday, June 11, 2009

Created a social rating framework today

Today I was at FujiSoft for the OpenSocial Hackathon organized by Google. Tomomichi Ono, Robert Gravina and I implemented a rating framework (and a sample movie rating social application using this framework) based on the idea I had proposed last week.

This framework allows anybody (without any hardware) to create a social rating application for social networks. This can be book ratings, news ratings, or anything that can be rated. "Social rating" means that the act of rating can be shared with friends and commented on, and in addition to the average rating you see the average rating from your friends, among others.
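
To make the two averages concrete, here is a minimal Python sketch. It is not the hackathon project's actual code: the function name `rating_summary` and the data layout (a dict of user names to ratings, plus a set of friends) are assumptions made for illustration.

```python
def rating_summary(ratings, friends):
    """Return (overall_average, friends_average) for one rated item.

    ratings: dict mapping a user name to that user's rating (hypothetical layout).
    friends: set of user names that the viewer is friends with.
    Either average is None when nobody in that group has rated the item.
    """
    def average(values):
        values = list(values)
        return sum(values) / len(values) if values else None

    overall = average(ratings.values())
    from_friends = average(r for user, r in ratings.items() if user in friends)
    return overall, from_friends


# Example: four people rated a movie, two of them are the viewer's friends.
movie_ratings = {"alice": 5, "bob": 3, "carol": 4, "dave": 2}
print(rating_summary(movie_ratings, friends={"alice", "carol"}))  # (3.5, 4.5)
```

Showing both numbers side by side is what makes the rating "social": the same movie can look average overall and still be a favorite among your friends.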

There is a lot to do before it is usable, so contributions are welcome! This Open Source project's code is available: server-side code in Python for App Engine, client-side code in JavaScript for OpenSocial. Today was a lot of fun!

Tuesday, June 9, 2009

Google gave me an Android smartphone!

I just came back from Google Developer Day 2009, where I attended enlightening presentations about OpenSocial application optimization, advanced Android programming, and how to push the limits of the Google Maps API, among others. I definitely recommend watching the videos when they come online. I ran into a lot of acquaintances and got introduced to very interesting people. As if that were not enough, Google offered me this shiny smartphone running Android!

Monday, June 8, 2009

Attending Google Developer Day 2009

Even though registration closed months ago, today I finally managed to get what they call a "VIP invitation" for the Google Developer Day 2009 tomorrow at Pacifico Yokohama. I will skip the morning sessions because they look boring, but the afternoon sessions sound very promising:
  • Life of an App Engine request
  • Java で動かす Google App Engine (Running Google App Engine with Java)
  • Potential of the Social Web
  • OpenSocial in Japan
  • Google & Open Source
  • Performance Tips for Geo API mashups
  • Google Wave APIs
  • HTML5 により拓かれる次世代 Web (The next-generation Web opened up by HTML5)

Saturday, June 6, 2009

Semantic Web: Information wants to be useful

Yesterday I gave a presentation about the Semantic Web at the Miracle Linux headquarters in Shinbashi, Tokyo. The attendees were 50 Japanese engineers from a lot of different IT companies. The presentation was broadcast live on ustream.tv and a video should be uploaded here soon. After my 90-minute presentation, I talked with many attendees over pizza and beer: very interesting questions and people! As the organizers said:「イベントは盛況のうちに終了しました。皆様ありがとうございました」("The event ended as a great success. Thank you, everyone.")

Wednesday, June 3, 2009

Brainstorming at Google's Shibuya office

I just came back from an evening at Google, where a dozen people and I exchanged ideas for new OpenSocial applications. The goal of this session was to define a few projects that will be implemented during next week's hackathon at FujiSoft.

The project I had hurriedly sketched on the train is one of those that will get implemented. It will allow anyone to easily create rating systems in social network services, where ratings will hugely benefit from being shared and from friends' interaction. I will describe this new project in detail later. It will be open source.

Tuesday, June 2, 2009

Semantic Web for everybody

Suppose you are about to launch a great new Semantic Web application, but you don't have much hardware. The bottleneck of most Semantic Web applications is the triplestore, which can be seen as the equivalent of the SQL database in normal web applications. Companies can afford the hardware to host a big one, but volunteers (such as mashup and open source people) cannot.

That's where free RDF data hosting comes in. Yes, you can get a triplestore for free. Couple that with a traditional LAMP stack (there is a lot of free LAMP hosting) and you can deploy a Semantic Web application for ¥0! A company called Talis is now offering free RDF data hosting, complete with remote SPARQL querying, provided the data is public domain. I hope other companies such as OpenLink will follow. I received an account from Talis' programme manager Leigh Dodds and tried it right away, loading and querying data, with success. They also provide an API that seems quite interesting, but I have not tried it yet.
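
As a rough illustration of what "remote SPARQL querying" means in practice: the client sends the query text to an HTTP endpoint in a `query` parameter, as defined by the SPARQL Protocol. The Python sketch below only builds such a URL. The endpoint address is a placeholder, not Talis' real URL, and `sparql_url` is a helper name made up for this example.

```python
from urllib.parse import urlencode

# Placeholder endpoint: NOT a real Talis URL, replace with your store's address.
ENDPOINT = "http://example.org/sparql"

def sparql_url(endpoint, query):
    """Build the GET URL for a SPARQL query (the protocol's `query` parameter)."""
    return endpoint + "?" + urlencode({"query": query})

# Ask for ten arbitrary triples from the store.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

url = sparql_url(ENDPOINT, QUERY)
print(url)
```

Fetching that URL, for instance with `urllib.request.urlopen`, is then all a LAMP-hosted script needs to do in order to read from a triplestore hosted elsewhere.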

This will be extremely useful to open-source/open-data groups who are run by volunteers and want to enter the Semantic Web scene. In particular, I expect this will lead to an explosion in the number and quality of mashup websites on the Web.

Preparing Google Developer Day 2009 Japan Hackathon

Tomorrow evening I will be at Google's Tokyo office to prepare the Google Developer Day 2009 Japan Hackathon, where I will concentrate on OpenSocial. EXPresso CEO Tomomichi Ono and I are preparing something big that involves OpenSocial. I will let you know!

Giving a presentation at Miracle Linux

I will give a presentation about the Semantic Web on Friday (2009/6/5) at 7PM at Miracle Linux's Tokyo headquarters, in Shinbashi.

The presentation will be the same one I gave a month ago, but this time in Japanese! I will try my best, but my friend Osonoi Yasushi, CEO of Open Dream, will be translating what I can't say yet in Japanese. The audience will be Japanese-speaking. Socialization activities will follow.

You can watch the event live at ustream.tv and a video will be available on YouTube a bit later.

Details on the Yokohama Linux Users Group's website.

Saturday, May 30, 2009

Mozilla Party Japan

Today the もじら組 (Mozilla Gumi) organized a party in Tokyo to celebrate 10 years of Mozilla. My W3C colleague Kazuyuki Ashimura gave a short presentation about the multimodal web, and community coordinator Asa Dotzler detailed the history and perspectives of the Mozilla project. Chao Po-chiang reported on the Taiwan group's activity, and a Ubiquity developer explained how natural language interpretation differs between Japanese and English, which was a great grammar exercise for me!

Thursday, May 14, 2009

Alfresco's CEO in Tokyo

Today I attended the Alfresco seminar organized by aegif, on the 49th floor of the Mori Tower. Alfresco's CEO John Powell presented the strengths of the open source Enterprise Content Management system, and several Japanese integration companies demonstrated their Alfresco-based offerings.

Sunday, May 10, 2009

Giving a presentation at Tokyo LUG

I gave a presentation about the Semantic Web, and how to use/contribute to it. At the top of Sun Microsystems' headquarters building, the presentation lasted for about 90 minutes and should appear soon on YouTube. My slides (PDF) are copyleft.

About 40 people showed up. They reacted well and had very interesting questions. The presentation was followed by a nomikai (drinking party) in Yoga and a pub in Shibuya.