At the start of this year, my friend and fellow tester Zeljko was working in Guadalajara, Mexico. You can read about his experience here. He provided a Picasa photo album and, because he is a tester, Zeljko asked a question in this photo comment.
Here is my answer.
I know that Željko's name starts with the Croatian letter Ž, and that letter was printed as Å½, so my first assumption was that the Bikla application does not handle utf-8 encoding correctly. First, I wanted to replicate the issue.
I first found a web page containing the letter Ž. The UTF-8 hex code for Ž is c5bd (two bytes). To check whether Ž is actually encoded as utf-8 in the page source, save the page source as html, open it in vim, switch vim to hex mode (:%!xxd), and search for c5bd. In the right column you will see the text corresponding to the letter Ž (in vim hex mode it is displayed as ..).
Another option is to check that your browser encoding is set to utf-8; if you can then see the letter Ž, you know that the letter has the proper utf-8 hex code c5bd.
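The same byte check can be done without a hex editor. Here is a minimal sketch in Python 3; the helper names `utf8_hex` and `contains_char` are only illustrative and are not part of any tool mentioned in this post:

```python
def utf8_hex(text):
    """Return the UTF-8 encoding of text as a lowercase hex string."""
    return text.encode("utf-8").hex()


def contains_char(raw_page_bytes, char):
    """Check whether the raw bytes of a saved page contain char encoded as UTF-8."""
    return char.encode("utf-8") in raw_page_bytes


print(utf8_hex("\u017d"))  # hex code of the letter Ž (U+017D)
```

To check a saved page, read it in binary mode (`open(path, "rb").read()`) and pass the bytes to `contains_char`.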
To reproduce the Bikla issue, after loading the web page I started changing my browser encoding. The first entry in the top-down list in Chromium is Western (ISO-8859-1), and with it the browser changed Ž into Å½. Bingo. This time I was lucky, but a skilled tester would not guess; he would try to use heuristics (more on heuristics you can find here) to determine the Bikla page encoding on the first try (Chromium supports 37 encodings). I also confirmed that Zeljko’s name was properly encoded in the Bikla database. Question for Zeljko: how was your name entered into the Bikla database?
Here is my heuristic. Go to Google, search for ‘spanish character encoding’ and try those encodings. Google returned the following encodings:
utf-8, iso-8859-1, iso-8859-15.
So this time, if I go through the encoding list from left to right, I will get the Bikla encoding on the first try, without needing any luck.
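The heuristic can also be automated instead of clicking through browser menus. The sketch below (Python 3) takes the correct character, encodes it as utf-8, and reports which candidate encodings turn those bytes into the garbled text that was observed. The function name `find_misreading` is hypothetical; the candidate list is the one Google returned above.

```python
CANDIDATES = ["utf-8", "iso-8859-1", "iso-8859-15"]


def find_misreading(original, garbled, candidates=CANDIDATES):
    """Return the encodings that misread the UTF-8 bytes of original as garbled."""
    raw = original.encode("utf-8")
    matches = []
    for name in candidates:
        try:
            decoded = raw.decode(name)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding
        if decoded == garbled:
            matches.append(name)
    return matches


# Ž (U+017D) was displayed as Å½ (U+00C5, U+00BD)
print(find_misreading("\u017d", "\u00c5\u00bd"))
```

Only iso-8859-1 matches here, because iso-8859-15 maps byte 0xbd to a different character.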
As a tester I identified the following:
the issue: Ž into Å½
testing tools: vim, chromium, google
use of heuristic, how to reproduce issue
Knowledge: character encodings.
Conclusion: I would inform Bikla that they should implement utf-8 encoding in their application. Their decision would depend on whether wrongly printed names affect Bikla's business.
For testers: find one more Croatian letter in the picture that was wrongly decoded.
In this blog post I will collect all important testing tweets. My plan is to use this post as a repository of testing tweets. All of these tweets can help every tester become an even better tester. I am using the “embed this tweet” Twitter feature.
Arrange software teams so that most of the communication necessary to develop a feature happens within a single team.
— Dale Emery (@dhemery) July 15, 2012
Arrange software teams so that each team is downstream from its own work.
— Dale Emery (@dhemery) July 15, 2012
Surely the people doing the work are those who should design the work. If so, managers’ role is to test & tune the work; remove blocks.
— Michael Bolton (@michaelbolton) July 16, 2012
I would like to describe what we have learned while testing with utf-8 encoded Croatian characters.
One issue on a previous project was a hanging java virtual machine (jvm) thread. The web service servlet started an application server thread to process an incoming xml message. The xml encoding is utf-8.
Then, using the spring and hibernate frameworks, a jdbc connection to the Informix database was used to store the data. In some cases in the production environment that thread hung indefinitely. After we gathered customer bug reports, we confirmed that xml with ‘broken’ encoding caused the hang. By broken encoding we mean that some Croatian character (e.g. ‘š’ with hex code c5 a1) was encoded with some other hex code. We reproduced the case by copy/pasting part of an xml message with broken utf-8 Croatian characters from a mantis bug report. At that moment we did not know how to produce those broken characters ourselves. On the server side we implemented code that intercepts such broken encoding and returns an appropriate error.
During the regression test, which used Python as the automation tool, tester @majapenovic received the broken encoding error. She asked the tester who wrote the testing script about the problem, and his solution was to delete the Croatian characters used for generating the xml input message. This is a VERY BAD TESTER'S DECISION. I told Maja to investigate the root of the problem.
utf-8 and Python
We learned about how to use utf-8 in Python from this excellent post.
We are using Jython for writing integration scripts. So, from the bottom up, you should configure the following for proper utf-8 string manipulation:
- the jvm that runs jython should have the following java option: -Dfile.encoding=UTF-8 (You can find that option in the bin/jython file.)
- at the beginning of the jython script: #coding=utf-8
- your editor encoding must be set to utf-8.
- your keyboard must be set to Croatian (if you work with the Croatian utf-8 character set)
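To get a unicode string from the byte stream received over http, we decode it:
croCharsInUnicodeUTF8 = byteStreamReceivedFromHttp.decode('utf-8')
# in a verification check, compare byte stream with byte stream!
For writing to a file as a utf-8 byte stream we use the following code:
import codecs
f = codecs.open(file_path, 'w', 'utf-8')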
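As a rough illustration of the two snippets above working together, here is a self-contained sketch. It is written for Python 3, whereas our scripts ran on Jython (Python 2.x), and the function name `save_as_utf8` and the sample xml are made up for the example.

```python
import codecs
import os
import tempfile


def save_as_utf8(raw_bytes, file_path):
    """Decode bytes received over http and write the text to a UTF-8 file."""
    text = raw_bytes.decode("utf-8")  # bytes -> unicode string
    with codecs.open(file_path, "w", "utf-8") as f:
        f.write(text)  # codecs encodes the text back to UTF-8 on write
    return text


# Hypothetical message containing the letter š (bytes c5 a1)
received = b"<name>\xc5\xa1</name>"
path = os.path.join(tempfile.mkdtemp(), "message.xml")
save_as_utf8(received, path)

# Verification compares byte stream with byte stream
with open(path, "rb") as f:
    written = f.read()
print(written == received)
```

Reading the file back in binary mode keeps the comparison on the byte level, so no implicit decoding can hide a difference.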
I helped Maja check whether our xml message was created with broken encoding. Every xml message is stored in the database. The database encoding was set to utf-8. We unloaded the record with our xml message (informix unload statement), and used vi in hex view mode (:%!xxd) to observe the hex encoding of the character ‘š’ in the xml message. We confirmed that it had wrong hex values.
Maja started the investigation and found a bug in our testing script: we called the encode() method on a byte stream twice in a row, in different methods. That caused the broken encoding of the xml message.
Maja did not follow any best practice; she adapted to the context of the problem. She used the existing code functionality. Write-to-file was called in several places, and she decided to observe the output of those methods. At some point, the output had broken encoding for the character ‘š’. After that she easily spotted the line with the bug in our script.
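To show what encoding twice does to the bytes, here is a small Python 3 sketch. It assumes that the already encoded bytes were implicitly read as ISO-8859-1 before the second encode; this post does not record which encoding Jython actually used for that step, so treat that as an assumption. Both function names are hypothetical.

```python
def double_encode(text, misread_as="iso-8859-1"):
    """Encode text to UTF-8, misread the bytes as another encoding, encode again."""
    once = text.encode("utf-8")
    return once.decode(misread_as).encode("utf-8")


def looks_double_encoded(raw):
    """Heuristic: True if undoing one ISO-8859-1 misreading yields different valid text."""
    try:
        text = raw.decode("utf-8")
        repaired = text.encode("iso-8859-1").decode("utf-8")
    except UnicodeError:
        return False  # not valid UTF-8, or not reversible this way
    return repaired != text


good = "\u0161".encode("utf-8")  # š
bad = double_encode("\u0161")
print(good.hex(), bad.hex())
print(looks_double_encoded(good), looks_double_encoded(bad))
```

The check is only a heuristic: legitimate text that happens to consist of such character pairs would be flagged as well.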
What we learned
A tester's job is to find proper solutions, not to do ‘dirty workarounds’. As a result of the problem investigation, we now know how to reproduce the broken encoding error.
Are you able to see the Croatian characters from this blog post in your browser?
Update regarding replace string method
If you need to do something like this:
request = request.replace('placeholder', unicode('Š', 'utf-8'))
you will get the following exception:
UnicodeDecodeError: 'ascii' codec can't decode byte
only if the string in which you are doing the replacement (in this example the request string) is a byte string containing non-ascii bytes rather than a unicode string. You usually get a unicode string either by calling the decode('utf-8') method on the bytes, or by reading a file using the following code snippet:
import codecs
f = codecs.open('file_path', 'r', 'utf-8')
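A minimal sketch of the safe order of operations: decode first, replace on the unicode string, and encode only once at the end. It is written for Python 3, where mixing bytes and text raises a TypeError instead of attempting the implicit ascii decode described above; `fill_placeholder` is a hypothetical helper name.

```python
def fill_placeholder(request_bytes, value, placeholder="placeholder"):
    """Replace a placeholder in a UTF-8 encoded request with a unicode value."""
    request_text = request_bytes.decode("utf-8")  # bytes -> unicode
    request_text = request_text.replace(placeholder, value)
    return request_text.encode("utf-8")  # encode exactly once


result = fill_placeholder(b"<name>placeholder</name>", "\u0160")  # Š
print(result)
```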
Update2 regarding UnicodeDecodeError: 'ascii' codec can't decode byte
Today I found an excellent post about the famous exception
UnicodeDecodeError: 'ascii' codec can't decode byte
Now I finally understand that rather strange exception. In Python 2.x the encode method works on unicode strings. That means that if you call it on a byte string, Python will first implicitly try to decode the bytes to a unicode object using the default ‘ascii’ encoding. If your byte string contains bytes outside the ascii range, you will trigger that exception.
The problem is that Python 2.x uses ascii as the default encoding for “historical reasons”.
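The implicit step can be made visible by performing it explicitly. This sketch runs on Python 3, which no longer decodes implicitly, so the ascii decode is written out by hand; `ascii_decode_error` is an illustrative name.

```python
def ascii_decode_error(raw):
    """Return the error message from decoding raw as ascii, or None on success."""
    try:
        raw.decode("ascii")  # the step Python 2.x performed behind the scenes
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


message = ascii_decode_error("\u0161".encode("utf-8"))  # bytes c5 a1 for š
print(message)
```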
…if you found in a trash bin a post it with username and password on it?
One of the important jobs of a software tester is to solve riddles. One of the great testers, Pradeep Soundararajan, is excellent at solving puzzles, but also at creating them. This blog post is inspired by his work.
Please put your answer in a comment on this post.
Today I read this post on bug.hr
The developers of the Guild Wars 2 game are kindly asking players to play their game next Wednesday, so that they can monitor the stress on the game servers. As a software tester, I will state the disadvantages of this approach.
Some of those disadvantages were stated in the blog comments by users.
“I will not participate in that stress test event, because you never know when game will crash”.
Players are human beings; they have feelings, and emotions are triggered while playing the game. Can you guess the player's emotion when the game is slow or when it crashes?
They will not continue playing. That means the developers (and this is also wrong in the post, because developers never monitor servers; there is another role for that) have only one chance to record the stress on the servers.
Here is how a stress test should be done.
You have testers who know how to code. Testers will cooperate with developers to write a virtual game client that generates game protocol traffic from client to server. The project manager should agree with developers, testers and operations engineers on what they want to find out with the stress test. Those scenarios should be run using the load test tool created by the testers/developers. A stress test should be defined at minimum by the following parameters:
- number of concurrent users
- which client scenario should be run.
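To make the two parameters concrete, here is a minimal sketch of a load driver in Python 3. It is not the Guild Wars 2 protocol or any real tool: `run_stress_test` and the `login_and_move` scenario are hypothetical, and the scenario only sleeps where a real virtual client would talk to the game server.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_stress_test(scenario, concurrent_users, iterations_per_user=1):
    """Run scenario(user_id) for each virtual user in parallel; return all timings."""

    def one_user(user_id):
        timings = []
        for _ in range(iterations_per_user):
            start = time.perf_counter()
            scenario(user_id)
            timings.append(time.perf_counter() - start)
        return timings

    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        per_user = list(pool.map(one_user, range(concurrent_users)))
    return [t for timings in per_user for t in timings]


def login_and_move(user_id):
    """Placeholder scenario; a real client would send protocol messages here."""
    time.sleep(0.01)


timings = run_stress_test(login_and_move, concurrent_users=5, iterations_per_user=2)
print(len(timings), max(timings))
```

Because the virtual users never get frustrated and quit, the same scenario can be repeated as often as needed.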
Led by our goal to put software testers on the Croatian software development map, Zeljko and I held a presentation named Test like dr. House at Webstrategija 14. Zeljko presented Watir and SauceLabs in action, and my goal was to present rapid software testing in a 5 minute demonstration.
Dr. House gave me the task of finding one problem with the page webstrategija.com. As a workshop on web application security was scheduled for the following day, I decided to find one security problem in five minutes. In order to give a presentation, all presenters had to register using a registration form. Zeljko warned me that on the presentation day the link to that form had been removed from the webstrategija page, so we found the original link using browser history. The link led to another company that provides registration services. The problem was that the collected form data was not sent to the server in encrypted form. I confirmed that by looking at the protocol information (globe icon instead of lock icon in the browser address bar) of the form service provider, and by using the Chrome web development tool's inspect element on the button that sends the form data to the server. The protocol in the html form element was http.
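The manual check with inspect element can be approximated in a few lines. This is only a sketch using the Python 3 standard library; the URLs are placeholders, not the real registration service, and `insecure_form_targets` is a hypothetical helper.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class FormActionCollector(HTMLParser):
    """Collect the action attribute of every form element on a page."""

    def __init__(self):
        super().__init__()
        self.actions = []

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            # A form without an action submits to the page's own URL
            self.actions.append(dict(attrs).get("action") or "")


def insecure_form_targets(page_url, html):
    """Return the form submission URLs that do not use https."""
    collector = FormActionCollector()
    collector.feed(html)
    targets = [urljoin(page_url, action) for action in collector.actions]
    return [t for t in targets if urlparse(t).scheme != "https"]


page = '<form action="http://forms.example.net/submit" method="post"></form>'
print(insecure_form_targets("http://example.com/register", page))
```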
Sending user data over the Internet unencrypted is not good. I informed the audience that their data (because everyone in the audience had used that registration form) could easily be tampered with.
Zeljko and I got the lowest evaluation feedback for our presentation. But I think that this is good. When a software tester asks questions that make somebody angry (the audience in this case), it means that the question reveals some product bug or issue (this is not my statement; it comes from a great tester who said it at the Lets Test 2012 conference).
I think that not using https for collecting user data is a product bug.
What do you think?
I attended Viaqa 12, the Croatian software testers conference. The conference topic was ‘Software testers education in Croatia’. As I have been a software tester for twelve years, the conference organizers invited me to participate in a round table discussion on that topic. The other participants were Zeljko Filipin, owner of the first Croatian software testing startup and a man with great experience in testing web applications; Tomislav Buza (via Skype), who writes on his popular blog about his usability testing of various web applications; and dr. Valentina Kirinic from Foi, a professor who teaches students about software testing.
When I finished my formal education in 1997, there was no course on software testing at my faculty, Fer. Everything I learned about testing was self-taught, and I started when a colleague from work asked me: ‘Have you ever read any book about software testing?’ So I started with the following books:
- How to Break Software: A Practical Guide to Testing, James Whittaker, 2002
- Lessons learned in software testing: a context-driven approach, Cem Kaner, James Bach, Bret Pettichord, 2001
Those books were a mind twister for me! They were the starting point for gaining software testing knowledge, because through the authors' Twitter connections I found out about other great testers. The important thing is that I practiced the knowledge in my daily work, and based on that I know that I am on the right path.
Thanks to Zeljko, Bret Pettichord attended Viaqa 2011 and I had the great pleasure of meeting him. After the conference sessions we talked over a beer at the pub Medvedgrad, and Bret explained to us, actually presented as an oral test report, the great Northeast blackout of 2003. At that moment I comprehended what a software tester is. James Bach is a software testing legend, and you can see why in his lecture on software testing. His video Steve McQueen Consulting Software Tester added even more to my comprehension of what a software tester is.
I attended the Rapid Software Testing course held by Michael Bolton, choosing that course based exclusively on Michael’s writing on his blog. For me Michael’s blog is one of the best blogs, not just in the field of software testing. I do not know how much time he spends on every blog post, but how he puts his thoughts into words is just amazing. And those thoughts are gold for every tester.
So I was very interested to hear what the professors from Foi teach students about software testing. I already had a clue, based on discussions of software tester education by other great testers. My fear was confirmed: a lot about standardization and maturity models. I asked whether they teach some practical testing, and the answer was that there are 30 hours of testing practice. They give students an assignment to test some web application based on some testing standard. No guidance at all. And this is bad. This is the reason why fresh graduates are afraid of software testing.
The good thing is that the Foi professors were part of the round table. They listened to what others said. They stayed to the end of the lightning talks.
We had the following lightning talks:
- @kisasondi about Security fuzzy testing
- @karlosmid Bela for testers, game of set.
- @zeljkofilipin about watir test automation
- and one exceptional student gave his view about education process at Foi.
At the end of the lightning talks, one student, who is writing his final diploma thesis about testing education, approached me and asked for some materials. I gave him a great presentation from the testing conference Lets Test 2012 titled ‘So you think you can test’.
As a conclusion I will state my goal: to bring any of the mentioned great testers to speak in front of Foi students about software testing, and to show Foi students what practical testing looks like during the 30 practical hours of their software testing course.
After the first Zagreb STC meeting, I mentioned that if anyone has any question about testing, I am willing to help via my Skype account (karlo.smid). I picked up the idea from Michael Bolton. One month after the meeting, I held my first Skype session. Here is the chat transcript:
Wednesday, December 14, 2011
[8:55:08 PM CEST] Karlo Smid: Hi!
[8:55:12 PM CEST] Hrvoje: Hi!
[8:55:23 PM CEST] Karlo Smid: We can start.
[8:55:32 PM CEST] Hrvoje: Ok
[8:55:38 PM CEST] Karlo Smid: Describe me application technology.
[8:56:01 PM CEST] Hrvoje: Java application running on linux server.
[8:56:04 PM CEST] Hrvoje: With mysql database.
[8:56:40 PM CEST] Hrvoje: Is that what you meant by technology?
[8:56:43 PM CEST] Hrvoje: Could we switch to call?
[8:56:57 PM CEST] Karlo Smid: Ok, so basic application architecture is application server + database.
[8:57:58 PM CEST] Karlo Smid: I would like to continue with skype chat, so I will have archive about this session.
[8:58:05 PM CEST] Hrvoje: Ok.
[8:58:37 PM CEST] Karlo Smid: Do you know approximate number of application users?
[8:59:16 PM CEST] Hrvoje: I am trying to reconstruct testing network, this application will be used for performance and functional testing.
[8:59:54 PM CEST] Hrvoje: In production environment, application has few tens of requests per second. This is peak traffic.
[9:00:16 PM CEST] Hrvoje: Average is 2-3 requests/sec.
[9:00:32 PM CEST] Karlo Smid: Web service requests?
[9:01:16 PM CEST] Hrvoje: No, the most frequent protocols are: HTTP, SOAP, UCP, SMPP.
[9:01:26 PM CEST] Karlo Smid: For test network you mean testing environment?
[9:01:45 PM CEST] Hrvoje: For our customer our application is blackbox, there is no any interface.
[9:01:49 PM CEST] Hrvoje: Yes.
[9:02:39 PM CEST] Hrvoje: What’s bothering me is that they are trying to introduce virtual machines instead of real hardware. I am not sure how application will operate on virtual machines. I do not know how to justify order for real hardware servers.
[9:03:24 PM CEST] Karlo Smid: You finished development of the application?
[9:03:32 PM CEST] Hrvoje: Yes.
[9:03:58 PM CEST] Hrvoje: We used virtual servers for functional tests, and there was no problem with that.
[9:04:00 PM CEST] Karlo Smid: Do you have any testing server for your application?
[9:04:05 PM CEST] Hrvoje: Yes.
[9:04:21 PM CEST] Hrvoje: 20 for production and 2 to 3 for testing.
[9:05:05 PM CEST] Hrvoje: Application is running in production environment.
[9:05:15 PM CEST] Karlo Smid: How complicated is it to simulate application data traffic? I assume that you have done that for one user, because you finished functional testing.
[9:05:49 PM CEST] Karlo Smid: Ok, that mean that you can record live application traffic.
[9:05:55 PM CEST] Hrvoje: Yes.
[9:06:42 PM CEST] Karlo Smid: Which application server do you use in production?
[9:07:27 PM CEST] Hrvoje: tomcat
[9:08:33 PM CEST] Karlo Smid: Let's define your problem. You have real linux hardware in production, and the customer wants to replace it with virtual servers. You do not know whether virtual servers will be able to handle real data traffic?
[9:09:06 PM CEST] Hrvoje: This is not the problem, but it is very close to my problem.
[9:09:23 PM CEST] Karlo Smid: What is your problem?
[9:11:16 PM CEST] Hrvoje: We want to create new test environment for our core application and 5-10 supporting applications. Management approved virtual technology for this test environment.
[9:12:35 PM CEST] Karlo Smid: Have you ever simulated live traffic with some load test tool?
[9:12:43 PM CEST] Hrvoje: I would like real hardware for that new testing environment, but I do not know how to argue for that to my management.
[9:13:11 PM CEST] Hrvoje: We used JMeter and internally developed applications.
[9:15:37 PM CEST] Karlo Smid: I am using virtual servers (Solaris 10 zones), mostly for functional testing. Zones do not have any restrictions (quotas) on shared hardware, so we are doing load test on them.
[9:16:33 PM CEST] Karlo Smid: For application functionality, virtual environment and OS on the actual hardware are the same.
[9:17:26 PM CEST] Karlo Smid: Of course, OS patch levels, java virtual machine version and settings, OS settings must be the same.
[9:18:14 PM CEST] Karlo Smid: You just have to dimension host hardware for the needed number of virtual machines.
[9:19:09 PM CEST] Karlo Smid: What is virtualization technology?
[9:19:19 PM CEST] Hrvoje: It is very hard to simulate real traffic, because there is great number of traffic use case possibilities.
[9:19:36 PM CEST] Hrvoje: I think that it is VMware.
[9:21:10 PM CEST] Karlo Smid: For virtualization server host machine, you also need to have enough disk space, not just big amount of physical memory. The reason is swap memory configuration on disk drive.
[9:22:05 PM CEST] Karlo Smid: Real traffic consists of great number of different requests?
[9:23:20 PM CEST] Hrvoje: Yes, every mobile operator for every country have different settings for SMS/MMS services.
[9:23:45 PM CEST] Hrvoje: Requests differ in application and database routing.
[9:24:54 PM CEST] Hrvoje: Requests have common parameters, but request handling depends on service type.
[9:25:35 PM CEST] Karlo Smid: Have you heard of http://www.pairwise.org/?
[9:25:43 PM CEST] Hrvoje: No
[9:27:01 PM CEST] Karlo Smid: This is useful technique when you have great number of test cases. It reduces the number of test cases by keeping test case coverage.
[9:27:36 PM CEST] Karlo Smid: There is tool for that http://www.satisfice.com/tools.shtml
[9:29:24 PM CEST] Karlo Smid: Your goal is to predict the application behavior for those request combinations?
[9:30:22 PM CEST] Hrvoje: Yes, but for load test, not for functional test.
[9:31:02 PM CEST] Hrvoje: Our system has two user roles: operator and customer.
[9:31:28 PM CEST] Karlo Smid: Could you record production application SQL queries?
[9:32:01 PM CEST] Hrvoje: Yes
[9:32:32 PM CEST] Hrvoje: From log files or from source code, but there is a lot of them.
[9:32:49 PM CEST] Hrvoje: And I do not see the purpose of using grep on log files.
[9:32:59 PM CEST] Hrvoje: I think that grep would be hard to do.
[9:33:16 PM CEST] Karlo Smid: Your application has already been deployed in production. What do you want to measure using load test on new testing environment?
[9:34:01 PM CEST] Hrvoje: For example, we have a new customer and that new customer wants to have 70 SMS/sec. in our application.
[9:34:25 PM CEST] Hrvoje: I do not know is our application able to handle that peak traffic. And how long it will be able to handle it.
[9:35:26 PM CEST] Karlo Smid: Ok, here is what I would do.
[9:35:53 PM CEST] Hrvoje: I need arguments for real testing hardware, not virtualization solution. Or arguments for the virtualization solution.
[9:36:41 PM CEST] Karlo Smid: 1. Talk with your developers. Do they know which indexes are needed for their SQL queries?
[9:37:23 PM CEST] Karlo Smid: 2. Database has to be BIG, production data scale. What is the number of records in your production system?
[9:37:35 PM CEST] Hrvoje: Yes.
[9:38:13 PM CEST] Karlo Smid: Could you replicate production database in your test environment? Do you have that storage capacity?
[9:38:55 PM CEST] Karlo Smid: What is the deployment of the Java Enterprise application? war or ear archive?
[9:39:38 PM CEST] Hrvoje: I can replicate production database.
[9:39:52 PM CEST] Hrvoje: Several jar archives.
[9:41:08 PM CEST] Karlo Smid: Have you ever used jconsole (it is part of standard JDK)? This is java jmx client for monitoring java virtual machine (jvm) parameters.
[9:41:45 PM CEST] Hrvoje: No, is this similar to htop?
[9:41:56 PM CEST] Hrvoje: Or munin?
[9:43:42 PM CEST] Karlo Smid: No, it monitors jvm, in your case Tomcat instance. jmx port should be activated on Tomcat instance.
[9:45:16 PM CEST] Karlo Smid: Have you ever tuned heap jvm parameters? Garbage collector parameters? These are all important jvm parameters regarding the performance. If you have strong hardware and default settings of those jvm parameters, your hardware is not used in full potential.
[9:46:05 PM CEST] Hrvoje: Yes.
[9:46:35 PM CEST] Hrvoje: System architect and developers are taking care of those parameters.
[9:46:41 PM CEST] Karlo Smid: Using jconsole you will be able to determine are those parameters properly set.
[9:47:02 PM CEST] Hrvoje: What is the format of results?
[9:47:50 PM CEST] Karlo Smid: Regarding the traffic simulation, I would write client for one case of user traffic. I prefer Grinder.
[9:49:40 PM CEST] Karlo Smid: Then I would increase the client traffic in steps of 25 concurrent clients. For that you also need hardware (linux server with quad core and 16GB of RAM could easily produce 70 requests per second.)
[9:50:25 PM CEST] Karlo Smid: Using jconsole I would monitor server jvm. grinder gives out of the box system response times.
[9:52:20 PM CEST] Karlo Smid: Have you ever got java out of memory exception in production system?
[9:52:31 PM CEST] Hrvoje: Sometimes.
[9:54:37 PM CEST] Karlo Smid: This is indication that you have java memory leak in your application, or you should tweak heap memory settings. http://www.ibm.com/developerworks/library/j-leaks/
[9:55:33 PM CEST] Karlo Smid: Have I helped you so far?
[9:55:45 PM CEST] Hrvoje: What is your opinion about jmeter??
[9:56:08 PM CEST] Hrvoje: Yes, you gave me enough input for investigation/thinking.
[9:57:46 PM CEST] Karlo Smid: I am not fond of tools which offer GUI programming because they are not flexible enough. Grinder is Java Load Programming framework. http://grinder.sourceforge.net/. You should try grinder, you or your developers. I can help you to overcome the starting learning curve.
[9:59:05 PM CEST] Karlo Smid: Why grinder? Because it gives you API for concurrent programming. You do not have to worry about deadlocks, concurrent file logging. You can easily use any java API. You can use its out of the box httprequest API class.
[10:00:07 PM CEST] Hrvoje: Thank you for your time and will.
[10:01:19 PM CEST] Karlo Smid: See you on the next meeting, I will give grinder presentation.
So we talked about virtualization and performance testing. From the chat transcript you can see the difference between load, peak and duration performance tests. We mentioned the open source load testing tools JMeter and Grinder. I explained why Grinder is the better tool. I also gave a load test plan for the application depending on its context.
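The stepwise part of that plan (adding 25 concurrent clients at a time until the target rate is reached) can be sketched as a small loop. This is plain Python 3, not Grinder's API; `ramp_until` and the `measure` callback are hypothetical, and in a real run `measure` would start the load clients and return the observed requests per second.

```python
def ramp_until(measure, target_rate, step=25, max_clients=500):
    """Increase concurrent clients in steps until measure() reaches target_rate."""
    history = []
    for clients in range(step, max_clients + 1, step):
        rate = measure(clients)  # observed requests per second at this load
        history.append((clients, rate))
        if rate >= target_rate:
            break
    return history


# Fake measurement for illustration: one request per second per client
history = ramp_until(lambda clients: float(clients), target_rate=70)
print(history)
```

Keeping the whole history, not just the last step, shows how throughput grows with load and where it stops growing.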
Using my Twitter reading list (Tester Tower Live) I stumbled upon this great four part video lecture on Eliminating Assumptions from Sean Day, professional StarCraft player, shoutcaster, and stand-up comedian. The link to the video lecture was categorized in my company as violating the company's code of ethics! After I reported this as an exception to the isit department, they agreed that the link should be unblocked. In this lecture Day, based on his gaming (or should I say testing) experience, lists nine assumptions that work against the player (tester) when solving video game challenges. What shook my mind in this lecture were the examples from video games, puzzles and riddles that Day used to explain why those assumptions are wrong.
Enjoy the lecture!