Problems with setting UTF-8 encoding
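Note: garbled text depends on the concrete class that implements the request. What has been established so far is that before RequestDispatcher.forward is called the request is an org.apache.catalina.connector.RequestFacade, while after the forward it is an org.apache.catalina.core.ApplicationHttpRequest. The two classes use different default logic for choosing the decoding charset when they parse parameters, and the factors that cause garbling also differ from protocol to protocol. For details see "Tomcat source analysis: inside ServletRequest.getParameterValues, the request character set and QueryStringEncoding".

How garbled text arises

Take the Chinese character "中". Encoded as UTF-8 it becomes three bytes, written %E4%B8%AD in percent-encoded form. Those three bytes are submitted to Tomcat by GET or POST. If you do not tell Tomcat that the parameter is encoded as UTF-8, Tomcat assumes it is ISO-8859-1. ISO-8859-1 is a single-byte encoding that is compatible with ASCII (and therefore with US-ASCII, the standard character set of URIs) and that uses the entire single-byte space. Tomcat therefore believes it has received three characters encoded as ISO-8859-1, decodes them as ISO-8859-1, and gets "ä¸" followed by an invisible soft hyphen (U+00AD).

After decoding, that three-character string exists in the JVM as Unicode, whereas HTTP transport and database storage deal in bytes. Depending on what each endpoint needs, you can encode the Unicode string as UTF-8 and store the resulting bytes in the database (three characters, which take six bytes in UTF-8), or you can take the three ISO-8859-1 bytes behind those three characters and decode them as UTF-8 to recover the Unicode character "中". The second route works because a byte stream in any encoding can be treated as ISO-8859-1 without losing bytes. The result is then passed to the client through the response, and the bytes actually sent depend on the content type you set.

Summary: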
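1, HTTP GET and POST transmit bytes, and a database stores bytes too (500 MB of space means 500 million bytes).
2, Garbled text arises when the charset used for decoding differs from the one used for encoding. The same bytes can map to different characters under different encodings, and some byte sequences do not exist at all in a given encoding (which is where the "?" in garbled output comes from).
3, A decoded string exists in the JVM as Unicode.
4, If the Unicode characters in the JVM are the ones you expected (the encoding and decoding charsets were the same or compatible), there is no problem. If they are not, as in the example above where the JVM holds three unexpected Unicode characters, you can take the three ISO-8859-1 bytes behind those characters and decode them as UTF-8 to produce the Unicode character "中".
5, ISO-8859-1 is a single-byte encoding that is compatible with ASCII and uses the entire single-byte space, so a system that supports ISO-8859-1 can transmit and store a byte stream in any other encoding without dropping bytes. In other words, treating a byte stream in any other encoding as ISO-8859-1 causes no loss.

The code below shows that encoding with different charsets gives different results, and that the output is garbled whenever the encoder and decoder disagree or the Chinese character does not exist in the target charset, as is the case with ISO-8859-1.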
Java code
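import java.io.UnsupportedEncodingException;
import java.math.BigInteger;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.Arrays;

public class EncodingDemo {
    public static void main(String[] args) {
        try {
            // URL-encoding "中" as UTF-8 gives %e4%b8%ad. Read as ISO-8859-1,
            // the same bytes are "ä¸" plus a soft hyphen (U+00AD).
            String item = new String(new byte[] { (byte) 0xe4, (byte) 0xb8, (byte) 0xad }, "UTF-8");
            // 中
            System.out.println(item);

            item = new String(new byte[] { (byte) 0xe4, (byte) 0xb8, (byte) 0xad }, "ISO-8859-1");
            // "ä¸" plus a soft hyphen
            System.out.println(item);

            // Arrays.toString prints the byte values rather than the array reference
            System.out.println(Arrays.toString(new BigInteger("253").toByteArray()));
            System.out.println(Integer.toBinaryString(253));

            // Back to 中: take the ISO-8859-1 bytes and decode them as UTF-8
            item = new String(item.getBytes("ISO-8859-1"), "UTF-8");
            System.out.println(item);
            // Garbled again: take the UTF-8 bytes and decode them as ISO-8859-1
            item = new String(item.getBytes("UTF-8"), "ISO-8859-1");
            System.out.println(item);

            // "中" encoded as UTF-8 is %E4%B8%AD (3 bytes)
            System.out.println(URLEncoder.encode("中", "UTF-8"));
            // "中" encoded as ISO-8859-1 is %3F (1 byte): the character does not exist
            // in ISO-8859-1, so the encoding of "?" is returned instead
            System.out.println(URLEncoder.encode("中", "ISO-8859-1"));
            // "中" encoded as GB2312 is %D6%D0 (2 bytes)
            System.out.println(URLEncoder.encode("中", "GB2312"));

            // Decoding the UTF-8 form %E4%B8%AD as UTF-8 gives the correct character 中
            System.out.println(URLDecoder.decode("%E4%B8%AD", "UTF-8"));
            // Decoding the ISO-8859-1 form %3F as ISO-8859-1 gives ?
            System.out.println(URLDecoder.decode("%3F", "ISO-8859-1"));
            // Decoding the GB2312 form %D6%D0 as GB2312 gives the correct character 中
            System.out.println(URLDecoder.decode("%D6%D0", "GB2312"));
            // Decoding the UTF-8 form %E4%B8%AD as ISO-8859-1 gives "ä¸" plus a soft hyphen.
            // This is the so-called garbled text: each of the 3 bytes is mapped to its
            // ISO-8859-1 character, because ISO-8859-1 uses the entire single-byte space.
            System.out.println(URLDecoder.decode("%E4%B8%AD", "ISO-8859-1"));
            // Decoding the UTF-8 form %E4%B8%AD as GB2312 gives 涓 followed by a
            // replacement mark: the first 2 bytes %E4%B8 are the GB2312 character 涓,
            // and the third byte %AD alone is not valid GB2312.
            System.out.println(URLDecoder.decode("%E4%B8%AD", "GB2312"));
        } catch (UnsupportedEncodingException e) {
            e.printStackTrace();
        }
    }
}

Tomcat's default encoding settings and the related standards:

For GET requests, the "URI Syntax" specification requires HTTP query strings (also called GET parameters) to use US-ASCII. Every character outside that range must be percent-encoded into the %61 form. Because ISO-8859-1 and ASCII agree for characters in the range 0x20 to 0x7E, most web containers, Tomcat among them, decode the %xx bytes of a URI as ISO-8859-1 by default (this is the default in the Tomcat versions discussed here; Tomcat 8 and later default URIEncoding to UTF-8). The URIEncoding attribute of the Connector changes the charset used to decode the %xx bytes of the URI. URIEncoding must match the encoding that was used to encode the GET query string. Alternatively, you can tell the container which encoding was used for the URL through Content-Type (in Tomcat this applies to the query string only when the Connector's useBodyEncodingForURI attribute is enabled).

A POST request should declare its own encoding through Content-Type. Because many clients do not set an explicit encoding, Tomcat falls back to ISO-8859-1. Note the difference between the charset used to decode the URI, the request charset, and the response charset. Different request implementations relate these three differently.

For POST requests, ISO-8859-1 is the default encoding for HTTP requests and responses defined by the Servlet specification. If the charset of a request or response has not been set, the Servlet specification mandates ISO-8859-1. The encoding of a request or response is declared through the Content-Type header.

If a GET or POST request does not set an encoding through Content-Type, Tomcat uses ISO-8859-1 by default. SetCharacterEncodingFilter can be used to change the default encoding of Tomcat requests. Its parameters are encoding (the charset to use) and ignore (true sets the encoding whether or not the client specified one, false sets it only when the client did not specify one; the default is true).

Note: this filter should generally be placed ahead of all other filters (before Servlet 3.0 the order follows the order of the filter-mapping entries in web.xml; from Servlet 3.0 on there are parameters for specifying the order), because setting the encoding has no effect once a value has been read from the request. The first time a value is read from the request, Tomcat converts the variables submitted in the query string or the POST body into a parameters array using the encoding in effect at that moment, and every later lookup reads straight from that array.

Recommended steps for using UTF-8 everywhere: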
1, Set URIEncoding="UTF-8" on your <Connector> in server.xml, so that Tomcat decodes HTTP GET request URIs as UTF-8.
2, Use a character encoding filter with the default encoding set to UTF-8. Many requests do not specify an encoding themselves, and Tomcat then uses ISO-8859-1 as the encoding of the HttpServletRequest; the filter changes that.
3, Change all your JSPs to include charset name in their contentType. For example, use <%@page contentType="text/html; charset=UTF-8" %> for the usual JSP pages and <jsp:directive.page contentType="text/html; charset=UTF-8" /> for the pages in XML syntax (aka JSP Documents). This sets the encoding used by the JSP page.
4, Change all your servlets to set the content type for responses and to include charset name in the content type to be UTF-8. Use response.setContentType("text/html; charset=UTF-8") or response.setCharacterEncoding("UTF-8"). This sets the encoding of the response.
5, Change any content-generation libraries you use (Velocity, Freemarker, etc.) to use UTF-8 and to specify UTF-8 in the content type of the responses that they generate. This sets the encoding used by every template engine.
6, Disable any valves or filters that may read request parameters before your character encoding filter or jsp page has a chance to set the encoding to UTF-8. SetCharacterEncodingFilter should normally be placed first, otherwise it may have no effect.
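The effect of step 1 can be sketched without a container. The class below is a minimal illustration, not Tomcat code: parseQuery is a hypothetical helper that splits a raw query string and percent-decodes each name and value with a charset passed in by the caller, which is roughly the role URIEncoding plays for the Connector. The sample query string is also made up for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryDecodeSketch {

    // Hypothetical helper (not a Tomcat API): split a raw query string and
    // percent-decode every name and value with the given charset.
    static Map<String, String> parseQuery(String rawQuery, String charsetName) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : rawQuery.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name, charsetName), decode(value, charsetName));
        }
        return params;
    }

    // Wraps the checked exception so callers only see an unchecked one.
    private static String decode(String text, String charsetName) {
        try {
            return URLDecoder.decode(text, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // The client percent-encoded the character U+4E2D as UTF-8.
        String rawQuery = "name=%E4%B8%AD&lang=zh";
        // Decoder charset matches the encoder: the original character comes back.
        System.out.println(parseQuery(rawQuery, "UTF-8"));
        // Decoder charset differs: three ISO-8859-1 characters, i.e. garbled text.
        System.out.println(parseQuery(rawQuery, "ISO-8859-1"));
    }
}
```

The same bytes produce different strings purely because of the charset argument, which is why the Connector setting has to agree with whatever the client used when it encoded the URL.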
Java code
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package filters;


import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;


/**
 * <p>Example filter that sets the character encoding to be used in parsing the
 * incoming request, either unconditionally or only if the client did not
 * specify a character encoding.  Configuration of this filter is based on
 * the following initialization parameters:</p>
 * <ul>
 * <li><strong>encoding</strong> - The character encoding to be configured
 *     for this request, either conditionally or unconditionally based on
 *     the <code>ignore</code> initialization parameter.  This parameter
 *     is required, so there is no default.</li>
 * <li><strong>ignore</strong> - If set to "true", any character encoding
 *     specified by the client is ignored, and the value returned by the
 *     <code>selectEncoding()</code> method is set.  If set to "false",
 *     <code>selectEncoding()</code> is called <strong>only</strong> if the
 *     client has not already specified an encoding.  By default, this
 *     parameter is set to "true".</li>
 * </ul>
 *
 * <p>Although this filter can be used unchanged, it is also easy to
 * subclass it and make the <code>selectEncoding()</code> method more
 * intelligent about what encoding to choose, based on characteristics of
 * the incoming request (such as the values of the <code>Accept-Language</code>
 * and <code>User-Agent</code> headers, or a value stashed in the current
 * user's session).</p>
 *
 * @author Craig McClanahan
 * @version $Id: SetCharacterEncodingFilter.java 939521 2010-04-30 00:16:33Z kkolinko $
 */

public class SetCharacterEncodingFilter implements Filter {


    // ----------------------------------------------------- Instance Variables


    /**
     * The default character encoding to set for requests that pass through
     * this filter.
     */
    protected String encoding = null;


    /**
     * The filter configuration object we are associated with.  If this value
     * is null, this filter instance is not currently configured.
     */
    protected FilterConfig filterConfig = null;


    /**
     * Should a character encoding specified by the client be ignored?
     */
    protected boolean ignore = true;


    // --------------------------------------------------------- Public Methods


    /**
     * Take this filter out of service.
     */
    public void destroy() {

        this.encoding = null;
        this.filterConfig = null;

    }


    /**
     * Select and set (if specified) the character encoding to be used to
     * interpret request parameters for this request.
     *
     * @param request The servlet request we are processing
     * @param response The servlet response we are creating
     * @param chain The filter chain we are processing
     *
     * @exception IOException if an input/output error occurs
     * @exception ServletException if a servlet error occurs
     */
    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain)
    throws IOException, ServletException {

        // Conditionally select and set the character encoding to be used
        if (ignore || (request.getCharacterEncoding() == null)) {
            String encoding = selectEncoding(request);
            if (encoding != null)
                request.setCharacterEncoding(encoding);
        }

        // Pass control on to the next filter
        chain.doFilter(request, response);

    }


    /**
     * Place this filter into service.
     *
     * @param filterConfig The filter configuration object
     */
    public void init(FilterConfig filterConfig) throws ServletException {

        this.filterConfig = filterConfig;
        this.encoding = filterConfig.getInitParameter("encoding");
        String value = filterConfig.getInitParameter("ignore");
        if (value == null)
            this.ignore = true;
        else if (value.equalsIgnoreCase("true"))
            this.ignore = true;
        else if (value.equalsIgnoreCase("yes"))
            this.ignore = true;
        else
            this.ignore = false;

    }


    // ------------------------------------------------------ Protected Methods


    /**
     * Select an appropriate character encoding to be used, based on the
     * characteristics of the current request and/or filter initialization
     * parameters.  If no character encoding should be set, return
     * <code>null</code>.
     * <p>
     * The default implementation unconditionally returns the value configured
     * by the <strong>encoding</strong> initialization parameter for this
     * filter.
     *
     * @param request The servlet request we are processing
     */
    protected String selectEncoding(ServletRequest request) {

        return (this.encoding);

    }


}
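Why the filter has to run before anything reads a parameter can be illustrated with a toy model. The sketch below is an assumption-laden stand-in, not Tomcat's implementation: ToyRequest is a hypothetical class that parses its body on the first getParameter call, caches the result, and falls back to ISO-8859-1 when no encoding has been set. It only mimics the behaviour described earlier, where a setCharacterEncoding call that arrives after the first read has no effect.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

class LazyParameterSketch {

    /** Toy stand-in for a servlet request. Hypothetical; not Tomcat's code. */
    static final class ToyRequest {
        private final String rawBody;            // e.g. "name=%E4%B8%AD"
        private String characterEncoding;        // null until someone sets it
        private Map<String, String> parameters;  // parsed once, then cached

        ToyRequest(String rawBody) {
            this.rawBody = rawBody;
        }

        void setCharacterEncoding(String encoding) {
            this.characterEncoding = encoding;
        }

        String getParameter(String name) {
            if (parameters == null) {
                // First read: decode with whatever encoding is in effect right now.
                String enc = (characterEncoding != null) ? characterEncoding : "ISO-8859-1";
                parameters = parse(rawBody, enc);
            }
            return parameters.get(name);
        }

        private static Map<String, String> parse(String raw, String enc) {
            Map<String, String> result = new HashMap<>();
            try {
                for (String pair : raw.split("&")) {
                    int eq = pair.indexOf('=');
                    if (eq <= 0) {
                        continue; // this sketch simply skips malformed pairs
                    }
                    result.put(URLDecoder.decode(pair.substring(0, eq), enc),
                               URLDecoder.decode(pair.substring(eq + 1), enc));
                }
            } catch (UnsupportedEncodingException e) {
                throw new IllegalArgumentException("Unknown charset: " + enc, e);
            }
            return result;
        }
    }

    public static void main(String[] args) {
        ToyRequest early = new ToyRequest("name=%E4%B8%AD");
        early.setCharacterEncoding("UTF-8");   // set before the first read
        System.out.println("\u4e2d".equals(early.getParameter("name")));  // true

        ToyRequest late = new ToyRequest("name=%E4%B8%AD");
        late.getParameter("name");             // first read caches the ISO-8859-1 result
        late.setCharacterEncoding("UTF-8");    // too late: the cache is already filled
        System.out.println("\u4e2d".equals(late.getParameter("name")));   // false
    }
}
```

In this model the second request stays garbled because the cache is never invalidated. A filter or valve that touches a parameter ahead of SetCharacterEncodingFilter would put a real request in the same position as the late one here.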