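This is what the output of git diff looks like. It seems simple enough: compare the old and new versions, show an added line with "+", show a deleted line with "-", and show a modified line as a "-" followed by a "+", meaning the line was removed from the old version and a line was added in the new one. In other words, a "delete" plus an "add" stands in for a "modify".
So we wrote the following test code: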
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class MyDiffTest {

    public static void main(String[] args) {
        List<String> linesOld = loadTxtFile2List("C:\\E\\javaCode\\git\\outer_project\\guodiantou\\gdtipark-user-servie\\src\\main\\java\\com\\goldwind\\ipark\\base\\test\\DemoClass1.java");
        List<String> linesNew = loadTxtFile2List("C:\\E\\javaCode\\git\\outer_project\\guodiantou\\gdtipark-user-servie\\src\\main\\java\\com\\goldwind\\ipark\\base\\test\\DemoClass2.java");

        // Naive mapping: line i of the old version is compared with line i of the new version.
        // Scan every line of the old version.
        int size = linesOld.size();
        for (int i = 0; i < size; i++) {
            String lineOld = linesOld.get(i);
            // Look up the same line number in the new version (it may not exist if the new file is shorter).
            String lineNew = i < linesNew.size() ? linesNew.get(i) : null;

            // If the line changed, report the old line as deleted and the new line as added.
            if (lineOld.equals(lineNew)) {
                System.out.println(lineOld);
            } else {
                System.out.println("- " + lineOld);
                if (lineNew != null) {
                    System.out.println("+ " + lineNew);
                }
            }
        }
        // Note: lines of the new version beyond the length of the old version are never printed.

        // Rough sketch of the idea:
        // xxxx     xxxx1    -xxxx
        // yyyy     yyyy     +xxxx1
        // xxxxxx   xxxxxx   xxxxxx
        // zzzz     zzzz     zzzz
    }

    // Reads a text file into a list with one entry per line.
    private static List<String> loadTxtFile2List(String filePath) {
        List<String> lines = new ArrayList<>();
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader = new BufferedReader(new FileReader(filePath))) {
            String line = reader.readLine();
            while (line != null) {
                lines.add(line);
                line = reader.readLine();
            }
            return lines;
        } catch (IOException e) {
            // Fail fast instead of returning null, which would only cause a NullPointerException later.
            throw new UncheckedIOException(e);
        }
    }
}
The two text files used, the old version and the new version, are as follows:
DemoClass1.java:
import com.goldwind.ipark.common.exception.BusinessLogicException;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.text.similarity.LevenshteinDistance;
@Slf4j
public class DemoClass1 {
    private static final LevenshteinDistance LEVENSHTEIN_DISTANCE = LevenshteinDistance.getDefaultInstance();

    public static String null2emptyWithTrim( String str ){
        if( str == null ){
            str = "";
        }
        str = str.trim();
        return str;
    }

    public static String requiredStringParamCheck(String param, String paramRemark) {
        param = null2emptyWithTrim( param );
        if( param.length() == 0 ){
            String msg = "操作失败,请求参数 \"" + paramRemark + "\" 为空";
            log.error( msg );
            throw new BusinessLogicException( msg );
        }
        return param;
    }

    public static double calculateSimilarity( String str1,String str2 ){
        int distance = LEVENSHTEIN_DISTANCE.apply(str1, str2);
        double similarity = 1 - (double) distance / Math.max(str1.length(), str2.length());
        System.out.println("相似度:" + similarity);
        return similarity;
    }
}
DemoClass2.java:
import com.goldwind.ipark.common.exception.BusinessLogicException;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.text.similarity.LevenshteinDistance;
@Slf4j
public class DemoClass2 {
    private static final LevenshteinDistance LEVENSHTEIN_DISTANCE = LevenshteinDistance.getDefaultInstance();
    private static final LevenshteinDistance LEVENSHTEIN_DISTANCE1 = LevenshteinDistance.getDefaultInstance();
    private static final LevenshteinDistance LEVENSHTEIN_DISTANCE2 = LevenshteinDistance.getDefaultInstance();

    public static String null2emptyWithTrim( String str ){
        if( str == null ){
            str = "";
        }
        str = str.trim();
        return str;
    }

    public static String requiredStringParamCheck(String param, String paramRemark) {
        param = null2emptyWithTrim( param );
        if( param.length() == 0 ){
            String msg = "操作失败,请求参数 \"" + paramRemark + "\" 为空";
            log.error( msg );
            throw new BusinessLogicException( msg );
        }
        return param;
    }

    public static double calculateSimilarity( String str1,String str2 ){
        int distance = LEVENSHTEIN_DISTANCE.apply(str1, str2);
        double similarity = 1 - (double) distance / Math.max(str1.length(), str2.length());
        System.out.println("相似度:" + similarity);
        return similarity;
    }
}
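DemoClass2.java differs from DemoClass1.java in two ways: the class name after "public class" is different, and the "private static final LevenshteinDistance LEVENSHTEIN_DISTANC..." line was copied two more times with the names changed. Running the program prints the following differences: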
import com.goldwind.ipark.common.exception.BusinessLogicException;
import lombok.extern.slf4j.Slf4j;
import org.apache.commons.text.similarity.LevenshteinDistance;
@Slf4j
- public class DemoClass1 {
+ public class DemoClass2 {
private static final LevenshteinDistance LEVENSHTEIN_DISTANCE = LevenshteinDistance.getDefaultInstance();
-
+ private static final LevenshteinDistance LEVENSHTEIN_DISTANCE1 = LevenshteinDistance.getDefaultInstance();
- public static String null2emptyWithTrim( String str ){
+ private static final LevenshteinDistance LEVENSHTEIN_DISTANCE2 = LevenshteinDistance.getDefaultInstance();
- if( str == null ){
+
- str = "";
+ public static String null2emptyWithTrim( String str ){
- }
+ if( str == null ){
- str = str.trim();
+ str = "";
- return str;
+ }
- }
+ str = str.trim();
-
+ return str;
- public static String requiredStringParamCheck(String param, String paramRemark) {
+ }
- param = null2emptyWithTrim( param );
+
- if( param.length() == 0 ){
+ public static String requiredStringParamCheck(String param, String paramRemark) {
- String msg = "操作失败,请求参数 \"" + paramRemark + "\" 为空";
+ param = null2emptyWithTrim( param );
- log.error( msg );
+ if( param.length() == 0 ){
- throw new BusinessLogicException( msg );
+ String msg = "操作失败,请求参数 \"" + paramRemark + "\" 为空";
- }
+ log.error( msg );
- return param;
+ throw new BusinessLogicException( msg );
- }
+ }
-
+ return param;
- public static double calculateSimilarity( String str1,String str2 ){
+ }
- int distance = LEVENSHTEIN_DISTANCE.apply(str1, str2);
+
- double similarity = 1 - (double) distance / Math.max(str1.length(), str2.length());
+ public static double calculateSimilarity( String str1,String str2 ){
- System.out.println("相似度:" + similarity);
+ int distance = LEVENSHTEIN_DISTANCE.apply(str1, str2);
- return similarity;
+ double similarity = 1 - (double) distance / Math.max(str1.length(), str2.length());
- }
+ System.out.println("相似度:" + similarity);
- }
+ return similarity;
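Why???
As the two screenshots above show, line 10 of the old version corresponds to line 10 of the new version. Intuitively, lines 11 and 12 of the new version were inserted between lines 10 and 11 of the old version. The program does not see it that way: it compares strictly by line number, so it concludes that the blank line 11 of the old version was changed into "private static final LevenshteinDistance LEVENSHTEIN_DISTANCE1 = LevenshteinDistance.getDefaultInstance();", and every later line is shifted by two and reported as modified. Why do our eyes see at once that lines 11 and 12 of the new version were inserted? Because we notice that lines 7, 8, 9, and 10 are nearly the same in both versions, and that lines 11~27 of the old version and lines 13~29 of the new version are nearly the same, so we naturally conclude that new lines 11 and 12 were simply inserted. So now let's implement that as an algorithm!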
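One common way to turn this intuition into code is to compute the longest common subsequence (LCS) of the two line lists: lines that belong to the LCS are treated as unchanged, and everything else is a deletion or an addition. The following is only a minimal sketch under that assumption. It is not necessarily the algorithm this article ends up with, and it is not what git itself runs by default (git uses the Myers algorithm). The class name LcsLineDiff, the method name diff, and the two-space prefix for unchanged lines are hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; names and output format are assumptions.
class LcsLineDiff {

    // Returns one entry per output line:
    // "  " prefix = unchanged, "- " = only in old, "+ " = only in new.
    static List<String> diff(List<String> oldLines, List<String> newLines) {
        int n = oldLines.size();
        int m = newLines.size();

        // lcs[i][j] = length of the LCS of oldLines[i..] and newLines[j..].
        // Filled from the bottom-right corner so the walk below can go forward.
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                if (oldLines.get(i).equals(newLines.get(j))) {
                    lcs[i][j] = lcs[i + 1][j + 1] + 1;
                } else {
                    lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
                }
            }
        }

        // Walk both lists from the top, using the table to decide what to emit.
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < n && j < m) {
            if (oldLines.get(i).equals(newLines.get(j))) {
                out.add("  " + oldLines.get(i));   // line is part of the LCS
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                out.add("- " + oldLines.get(i));   // dropping the old line loses nothing
                i++;
            } else {
                out.add("+ " + newLines.get(j));   // the new line was inserted
                j++;
            }
        }
        // Whatever remains on either side is a pure deletion or addition.
        while (i < n) {
            out.add("- " + oldLines.get(i++));
        }
        while (j < m) {
            out.add("+ " + newLines.get(j++));
        }
        return out;
    }

    public static void main(String[] args) {
        // Shortened stand-ins for DemoClass1.java and DemoClass2.java.
        List<String> oldLines = Arrays.asList(
                "public class DemoClass1 {", "DISTANCE", "", "method");
        List<String> newLines = Arrays.asList(
                "public class DemoClass2 {", "DISTANCE", "DISTANCE1", "DISTANCE2", "", "method");
        diff(oldLines, newLines).forEach(System.out::println);
    }
}
```

On the shortened input in main, the class line is reported as one deletion plus one addition, the two extra field lines are reported as additions, and the blank line and the method line are reported as unchanged, which matches what the eye sees. The table needs memory proportional to the product of the two line counts, so this sketch only suits small files.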
todo