
Key: NXP-351
Type: Bug
Status: Resolved
Resolution: Fixed
Priority: Major
Assignee: Olivier Grisel
Reporter: Stéfane Fermigier
Votes: 0
Watchers: 0
Nuxeo Enterprise Platform

Character encoding issues with non-latin character sets

Created: 09/12/06 22:49   Updated: 26/02/07 16:42
Component/s: Web UI
Affects Version/s: 5.0.0 GA
Fix Version/s: 5.1 M1

Time Tracking:
Not Specified

Resolution Date: 26/02/07 16:42
Require Callback: No
Participants: Olivier Grisel, Stéfane Fermigier and Thierry Delprat
Date of First Response: 18/12/06 17:12
Tags:


Description
From the mailing list (http://lists.nuxeo.com/pipermail/ecm/2006-December/000447.html):

"""
I'm having trouble submitting Unicode values with Nuxeo EP's HTML forms.
I've tested with Nuxeo EP 5.0.0.RC1 and RC2 using both IE and Firefox.
When working directly against Nuxeo Core in Java everything works fine -
I've managed to feed and retrieve values from the repository in Unicode
(Hebrew) correctly.

So far I've noticed the following:

The UTF-8 meta tag appears in the HTML sources:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
(thanks to "theme-view.xml" in nxthemes-jsf-filters and
nxthemes-jsf-editor),
yet the page is interpreted by the browser as ISO-8859-1 (according to
the "View Page Info" dialog in Firefox).

When inputting Hebrew into the form, each character is interpreted as its
'UCS code point': the Hebrew letter 'aleph', for instance, is submitted as
"&#1488;" and then retrieved and displayed as "&amp;#1488;" (so you just see a
long sequence of &#xxx;'s instead of the actual Hebrew text).

I've tried to tweak this myself by various means including editing the
sources but so far failed to find a solution. I'd appreciate your comments
on this, as it is a critical requirement for my application and probably for
many other international users.

Some useful reading resources:
http://www.sitepoint.com/blogs/2006/03/15/do-you-know-your-character-encodings/
http://ppewww.physics.gla.ac.uk/~flavell/charset/form-i18n.html
"""
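
The symptom described in the message can be reproduced outside the platform. The following is a minimal sketch, not Nuxeo code: the class and method names (NcrDemo, browserEncode, htmlEscape) are hypothetical, and it assumes the common browser behaviour of replacing characters that the form's charset cannot represent with numeric character references. It shows how 'aleph' becomes "&#1488;" on submission and "&amp;#1488;" once the stored value is HTML-escaped for display.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

public class NcrDemo {

    // Hypothetical helper: mimics a browser submitting a form from a page
    // it believes is ISO-8859-1. Characters outside that charset are
    // replaced by decimal numeric character references.
    static String browserEncode(String input) {
        CharsetEncoder latin1 = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder out = new StringBuilder();
        input.codePoints().forEach(cp -> {
            String ch = new String(Character.toChars(cp));
            if (latin1.canEncode(ch)) {
                out.append(ch);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    // Hypothetical helper: the usual HTML escaping applied when a stored
    // value is rendered back into a page.
    static String htmlEscape(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String aleph = "\u05D0";                    // Hebrew letter aleph
        String submitted = browserEncode(aleph);
        System.out.println(submitted);              // &#1488;
        System.out.println(htmlEscape(submitted));  // &amp;#1488;
    }
}
```

If the page is actually served as UTF-8, the browser has no reason to fall back to character references, and the value reaches the server as plain text.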

Comments
Olivier Grisel - 18/12/06 17:12
Quoting my reply on the mailing list:

The goal is to use the UTF-8 charset for the HTML views, but we have not yet had the opportunity
to fix the JBoss configuration to actually serve a UTF-8 content type instead of the default Latin-1.

Apparently we need to add something like the following line to the main
theme-view.xml facelet.

<jsp:directive.page contentType="text/html;charset=UTF-8" pageEncoding="UTF-8"/>

But I do not know what the equivalent is for Faces.

According to this page, we also need to update our default
jbossweb-tomcat55.sar/service.xml to make tomcat accept UTF-8 in URI parameters.

  http://wiki.jboss.org/wiki/Wiki.jsp?page=UTF8InPortlet
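
To illustrate why the container's handling of URI parameters matters, here is a minimal sketch using only the JDK (Java 10 or later). The class and method names (UriDecodingDemo, encodeUtf8, decode) are hypothetical and this is not how Tomcat is implemented; it only assumes that a browser percent-encodes a parameter as UTF-8 and that the container then decodes it with either its Latin-1 default or UTF-8.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class UriDecodingDemo {

    // Hypothetical helper: what a browser sends for a UTF-8 page.
    static String encodeUtf8(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: what the container does with the raw parameter,
    // depending on the charset it is configured to assume.
    static String decode(String raw, Charset containerCharset) {
        return URLDecoder.decode(raw, containerCharset);
    }

    public static void main(String[] args) {
        String aleph = "\u05D0";          // Hebrew letter aleph
        String raw = encodeUtf8(aleph);
        System.out.println(raw);          // %D7%90

        // Decoded as Latin-1, the two UTF-8 bytes become two unrelated characters.
        String wrong = decode(raw, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.length());       // 2

        // Decoded as UTF-8, the original character is recovered.
        String right = decode(raw, StandardCharsets.UTF_8);
        System.out.println(right.equals(aleph));  // true
    }
}
```

In Tomcat the assumed charset for URIs is controlled by the URIEncoding attribute of the Connector element, which is presumably the setting the wiki page above refers to.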

Thierry Delprat - 26/02/07 16:42
This seems to be solved.
The test case was:
 - create a note
 - add content by copy/paste in Arabic
 - add content by typing in Arabic
 - check display
 - search

==> OK