Tuesday 21 October 2014

A Fun Camel Ride With Apache Camel, Part Four: Exploring The Basic Route, Let's Process Those Files

Finally, we are getting into the meat of the tutorial. We will start with a basic implementation of the route, measure its performance, and then apply some more advanced techniques to improve it.

The first thing we need to do is create a route that reads from a folder and sends that information to a tag analyser. Apache Camel does not come with a digital music tag analyser, so we will need to create a custom processor to handle this task. We could learn about all the different file formats and how to read the tags associated with them, but that is a rather lengthy task. Instead, I have settled on the JAudioTagger library to read tags. Since it is not part of Camel, we need to add it as a Maven dependency.

Double-click the pom.xml file in your project explorer, switch to the Dependencies tab and click Add. A dialog will pop up from which you can search the Maven repositories. In the text box labeled "Enter groupId, artifactId, or sha1 prefix or pattern(*)", type jaudiotagger. Maven will search the repository and you should get some results. Choose the org jaudiotagger result and expand it, then select version 2.0.3. This will add the dependency to the project.
Adding JAudioTagger Dependency
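
For reference, the dependency that this dialog adds to pom.xml should look roughly like the fragment below. The coordinates are taken from the search result described above (group org, artifact jaudiotagger, version 2.0.3); the exact formatting your IDE writes may differ.

```xml
<!-- Goes inside the <dependencies> element of pom.xml -->
<dependency>
    <groupId>org</groupId>
    <artifactId>jaudiotagger</artifactId>
    <version>2.0.3</version>
</dependency>
```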

Now we can use the JAudioTagger library instead of coding everything ourselves the hard way. We also need to start defining the pieces of the route. The first thing is to clear out the current route, as it is an example and we don't need its contents. Go to the blueprint.xml file and delete all the components in the current route. Also go to the src/main/java folder and delete the two source files there, Hello.java and HelloBean.java. One last thing is to remove the helloBean from the blueprint.xml file. Open blueprint.xml, go to the Source tab and find the lines that read:

<bean id="helloBean" class="com.tutorial.namphibiansoftware.camelmusic.HelloBean">
<property name="say" value="Hi from Camel"/>
</bean>

Delete these lines. At this point you will have an empty route ready to start with. It should look more or less like this:
<?xml version="1.0" encoding="UTF-8"?>
<blueprint xmlns="http://www.osgi.org/xmlns/blueprint/v1.0.0"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:camel="http://camel.apache.org/schema/blueprint"
           xsi:schemaLocation="
             http://www.osgi.org/xmlns/blueprint/v1.0.0 http://www.osgi.org/xmlns/blueprint/v1.0.0/blueprint.xsd
             http://camel.apache.org/schema/blueprint http://camel.apache.org/schema/blueprint/camel-blueprint.xsd">

    <camelContext trace="false" id="blueprintContext" xmlns="http://camel.apache.org/schema/blueprint">
        <route id="timerToLog" customId="true">

        </route>
    </camelContext>

</blueprint>
The name of the route is still not correct: it says timerToLog, so change that to CamelMusicRoute. With that done, we are ready to start on the rest of the route. To get the basics working we will need the following components:
  • File endpoint to read the source directory
  • A custom processor that uses the JAudioTagger Library and inserts information into the database.
  • A processor to write the files to their destination folders
Let's add a file endpoint to the route. Go to the Design tab of the blueprint.xml and find the endpoints folder on the palette. Click and drag an endpoint component onto the design page. Make sure the file endpoint component is selected and the properties page is showing. You will see a text box labeled URI, and this is where we are going to configure the file endpoint.
My configuration URI looks as follows:

file://c:/mymusic?noop=true&recursive=true&delay=3000&exclude=.*.(jpg|JPG|gif|GIF|doc|DOC|pdf|PDF|avi|AVI|Mpg|mpg|db|DB|ini|INI|txt|TXT|m3u|M3U|mpeg|MPEG)&charset=utf-16

You should end up with a view that looks more or less like this:

Adding The File Endpoint

This essentially says: for each file in the c:\mymusic folder and its subfolders that does not have an extension listed in the exclude filter, read the file and send it along the Camel route. The endpoint is also configured to poll every 3000 milliseconds, or three seconds. Notice that I set noop to true. This is just to make sure that we don't have to reset the folder every time we debug or run the route. In a real route you would probably move the files to a .done folder or take some similar action. Also note that I like to add nice descriptions to my routes, endpoints and so on. This makes things a lot easier when you start debugging, tracing and monitoring routes and the various steps in them. A little bit of detail now will save you a lot of brain power later.
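
If you prefer the Source tab over the Design tab, the same endpoint can be sketched as a from element inside the route. This is a minimal sketch, assuming the route has already been renamed to CamelMusicRoute; the description text is only an illustration. Note that inside an XML attribute each & in the URI has to be written as &amp;.

```xml
<route id="CamelMusicRoute" customId="true">
    <!-- Illustrative description; use whatever wording helps you when tracing the route -->
    <description>Reads music files from c:\mymusic and its subfolders</description>
    <!-- Same URI as above, with each ampersand escaped for XML -->
    <from uri="file://c:/mymusic?noop=true&amp;recursive=true&amp;delay=3000&amp;exclude=.*.(jpg|JPG|gif|GIF|doc|DOC|pdf|PDF|avi|AVI|Mpg|mpg|db|DB|ini|INI|txt|TXT|m3u|M3U|mpeg|MPEG)&amp;charset=utf-16"/>
    <!-- The rest of the route (log, processors) follows here -->
</route>
```

One detail worth knowing about the exclude pattern: the . directly before the opening parenthesis is a regular-expression wildcard rather than a literal dot, so it matches any single character in front of the extension. Writing \. there would be the stricter form; the sketch keeps the URI exactly as given above.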

There are some notes on the file endpoint that you have to keep in mind. One important concept is that the body of the message will contain an org.apache.camel.component.file.GenericFile object; it will not contain the actual file data until you touch the body and convert it. This might seem counterintuitive at first, but there is a very solid reason behind it: scalability. The file endpoint must be able to scale, and if it read the contents of every file into the body, it would hamper its ability to quickly spool a lot of files into the route. Once you access the body you will be able to access the data of the file. I answered a StackOverflow question which contains more details, so I won't repeat it here.

The file component adds some information to the headers of the message. This is primarily file metadata such as file size, path, extension and so on. You can use this information to route files to different endpoints based on extension, for example.
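
As a rough illustration of routing on file metadata, the fragment below uses a choice with Camel's simple file language to branch on the extension. It is a sketch rather than part of this tutorial's route: the two log messages stand in for whatever endpoints you would really send the files to, and the comparison is case sensitive, so MP3 and mp3 would be treated differently.

```xml
<!-- Goes inside a route, after the <from> element -->
<choice>
    <when>
        <!-- file:ext is the file extension as seen by the simple file language -->
        <simple>${file:ext} == 'mp3'</simple>
        <log message="MP3 file: ${file:name}"/>
    </when>
    <otherwise>
        <log message="Other file type: ${file:name}"/>
    </otherwise>
</choice>
```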

Just to demo some of the headers, I will log the full path of each file that is being read. Go to the design view of the blueprint.xml and drag a log component from the palette onto the form. Make sure the log component is selected and add the following line as the message:
File name: ${file:absolute.path} was read at ${date:now:yyyy-MM-dd HH-mm-ss} and placed onto the route.
I also added the ID property and a description. You should end up with a route like this:

Adding A Log Component
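
In the Source tab, the route at this stage should look something like the sketch below. The log id value is a hypothetical name chosen for illustration, and the file URI is shortened here; in your own file it would carry the full set of options configured earlier.

```xml
<route id="CamelMusicRoute" customId="true">
    <!-- Shortened URI for readability; keep your full option list here -->
    <from uri="file://c:/mymusic?noop=true&amp;recursive=true&amp;delay=3000"/>
    <!-- logFileRead is an illustrative id, not one taken from the tutorial -->
    <log id="logFileRead" customId="true"
         message="File name: ${file:absolute.path} was read at ${date:now:yyyy-MM-dd HH-mm-ss} and placed onto the route."/>
</route>
```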

At this point you will be able to run the route as a basic test that all is in order. Right-click the project and go to the Debug As -> Debug Configurations menu item. Choose your camel music Maven configuration and click Debug. Watch your console output window for the results. If all goes well you should see something like the following:

[mel.test.blueprint.Main.main()] Activator                      INFO  Camel activator started
[mel.test.blueprint.Main.main()] BlueprintExtender              INFO  No quiesce support is available, so blueprint components will not participate in quiesce operations
[Blueprint Extender: 1] BlueprintContainerImpl         INFO  Bundle camelmusic is waiting for namespace handlers [http://camel.apache.org/schema/blueprint]
[Blueprint Extender: 1] BlueprintCamelContext          INFO  Apache Camel 2.12.0.redhat-610379 (CamelContext: blueprintContext) is starting
[Blueprint Extender: 1] ManagedManagementStrategy      INFO  JMX is enabled
[Blueprint Extender: 1] BlueprintCamelContext          INFO  AllowUseOriginalMessage is enabled. If access to the original message is not needed, then its recommended to turn this option off as it may improve performance.
[Blueprint Extender: 1] BlueprintCamelContext          INFO  StreamCaching is not in use. If using streams then its recommended to enable stream caching. See more details at http://camel.apache.org/stream-caching.html
[Blueprint Extender: 1] FileEndpoint                   INFO  Endpoint is configured with noop=true so forcing endpoint to be idempotent as well
[Blueprint Extender: 1] FileEndpoint                   INFO  Using default memory based idempotent repository with cache max size: 1000
[Blueprint Extender: 1] BlueprintCamelContext          INFO  Route: CamelMusicRoute started and consuming from: Endpoint[file://c:/mymusic?charset=utf-16&delay=3000&exclude=.*.%28jpg%7CJPG%7Cgif%7CGIF%7Cdoc%7CDOC%7Cpdf%7CPDF%7Cavi%7CAVI%7CMpg%7Cmpg%7Cdb%7CDB%7Cini%7CINI%7Ctxt%7CTXT%7Cm3u%7CM3U%7Cmpeg%7CMPEG%29&noop=true&recursive=true]
[Blueprint Extender: 1] BlueprintCamelContext          INFO  Total 1 routes, of which 1 is started.
[Blueprint Extender: 1] BlueprintCamelContext          INFO  Apache Camel 2.12.0.redhat-610379 (CamelContext: blueprintContext) started in 0.451 seconds
[ thread #0 - file://c:/mymusic] CamelMusicRoute                INFO  File name: c:\mymusic\Adventures Beyond The Ultraworld\The Orb-Lfc - Cumunolibus Mix.mp3 was read at 2014-10-21 18-25-45 and placed onto the route.
[ thread #0 - file://c:/mymusic] CamelMusicRoute                INFO  File name: c:\mymusic\Adventures Beyond The Ultraworld\The Orb-Lfc - Tsunami One.mp3 was read at 2014-10-21 18-25-45 and placed onto the route.
[ thread #0 - file://c:/mymusic] CamelMusicRoute                INFO  File name: c:\mymusic\Adventures Beyond The Ultraworld\The Orb-Little 

In the next part of the tutorial I will deal with the JAudioTagger processor we need to complete this route. I use several techniques in creating this processor to make sure it is ready to scale to absurd levels, so you will have to read the next part, as I don't want to bombard you with all the information at once. Besides, this part is already extremely long and I need a break. Till next time.
