xml - Label: XML - Content De-duplication
question 1---> I am working on a project that translates English content into 17 other languages. To reduce translation cost, we compute an MD5 hash code for each topic and, based on the result, decide whether the topic is new (master) or was translated earlier (obsolete). The logic is complicated and I want to reduce its complexity. We are using FileNet as the content management system, which is quite old. :) I need the best suggestions for content de-duplication apart from MD5 hashing.
note :- A topic here means an XML file with images, rendered via XSLT; it does not follow the DITA standard.
question 2--->
What is the best alternative for rendering a non-standard (non-DITA) XML file on the UI as HTML or PDF?
Thanks in advance. Waiting for your best suggestions.
question 1
I recommend not relying on hashes or time stamps alone, although it depends on your environment. If you refactor variables, change indentation, add or remove comments, etc., the content itself does not change, so this should not trigger the translation process. You may rely on metadata to trigger a semi-automatic process. Further on, use a diffing mechanism to compare the current version of a document with an earlier one.
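As a rough illustration of that idea, here is a minimal Python sketch (Python 3.8+), not a FileNet integration. It assumes each topic is available as a well-formed XML string. The function names `normalized`, `content_key`, `text_segments` and `changed_segments` are made up for this example. It canonicalizes the XML so that indentation and comments no longer affect the comparison, and then diffs only the text segments, so that only the changed sentences would need to go to translation.

```python
import difflib
import hashlib
import xml.etree.ElementTree as ET


def normalized(xml_text):
    """Canonical XML (C14N 2.0) with comments dropped and text whitespace stripped."""
    return ET.canonicalize(xml_text, with_comments=False, strip_text=True)


def content_key(xml_text):
    """Hash of the canonical form; formatting-only edits yield the same key."""
    return hashlib.sha256(normalized(xml_text).encode("utf-8")).hexdigest()


def text_segments(xml_text):
    """Non-empty text nodes in document order (the default parser skips comments)."""
    root = ET.fromstring(xml_text)
    return [t.strip() for t in root.itertext() if t.strip()]


def changed_segments(old_xml, new_xml):
    """Return only the added ('+ ') and removed ('- ') text segments."""
    diff = difflib.ndiff(text_segments(old_xml), text_segments(new_xml))
    return [line for line in diff if line.startswith(("+ ", "- "))]
```

If `content_key` matches an earlier version, the topic can be treated as already translated; otherwise `changed_segments` shows which pieces of text actually differ. Whether a hash of the normalized form or a plain diff fits better depends on how the translations are stored.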
question 2
As with the first question, this one is hard to answer without knowing your environment. It would be smarter to first convert the files to DITA or Markdown, and then use DITA-OT or a Markdown processor for further transformation.
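To give an idea of what such a conversion step could look like, here is a small Python sketch. The real topic schema is not known, so the element names `topic`, `title`, `para` and `image` (with `href` and `alt` attributes) are purely hypothetical, and so is the function name `topic_to_markdown`. It only handles direct children of the root element.

```python
import xml.etree.ElementTree as ET


def topic_to_markdown(xml_text):
    """Convert a (hypothetical) topic XML string to Markdown text."""
    root = ET.fromstring(xml_text)
    blocks = []
    for elem in root:
        if elem.tag == "title":
            # Title becomes a level-1 heading.
            blocks.append("# " + "".join(elem.itertext()).strip())
        elif elem.tag == "para":
            # Collapse line breaks and indentation inside a paragraph.
            blocks.append(" ".join("".join(elem.itertext()).split()))
        elif elem.tag == "image":
            # Image reference in Markdown syntax: ![alt](path)
            blocks.append("![{}]({})".format(elem.get("alt", ""), elem.get("href", "")))
    return "\n\n".join(blocks) + "\n"
```

The resulting Markdown can then be handed to whatever Markdown processor you choose to produce HTML or PDF. A real topic format will most likely need more cases (lists, tables, nested elements).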