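Problem description
Given a series of timestamps and their corresponding values, find the points that are missing on the time grid and fill in their values.
For example, with an interval of 1 and timestamps 1, 2, 4, 5, the point at timestamp 3 is missing and its value has to be estimated from its neighbors.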
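As a small illustration of the detection step, the sketch below lists the timestamps that are absent from a regular grid. It assumes the timestamps are sorted in ascending order and that the interval is positive; the names `MissingTimestamps` and `findMissing` are made up for this example and are not part of the original code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MissingTimestamps {
    /** Returns the grid timestamps between the first and last entry that are not in the input. */
    static List<Long> findMissing(List<Long> timestamps, long interval) {
        List<Long> missing = new ArrayList<>();
        if (timestamps == null || timestamps.isEmpty() || interval <= 0) {
            return missing;
        }
        int i = 0;
        long last = timestamps.get(timestamps.size() - 1);
        for (long t = timestamps.get(0); t <= last; t += interval) {
            // Skip observed timestamps that fall before the current grid point
            while (i < timestamps.size() && timestamps.get(i) < t) {
                i++;
            }
            if (i < timestamps.size() && timestamps.get(i) == t) {
                i++;                // grid point is present
            } else {
                missing.add(t);     // grid point is absent
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: 3, 6 and 7 are absent from the grid
        System.out.println(findMissing(Arrays.asList(1L, 2L, 4L, 5L, 8L, 9L), 1L));
    }
}
```

Only the timestamps are needed here; the values come into play once the gaps are filled.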
Source code
<!-- https://mvnrepository.com/artifact/org.apache.commons/commons-math3 -->
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-math3</artifactId>
    <version>3.6.1</version>
</dependency>
Add the Maven dependency above, then import the required classes:
import java.util.LinkedList;
import java.util.List;
import org.apache.commons.math3.analysis.interpolation.LinearInterpolator;
import org.apache.commons.math3.analysis.polynomials.PolynomialSplineFunction;
Source code for filling the missing values:
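/**
 * Fills in missing values on a regular timestamp grid using linear interpolation.
 * @param timestamps observed timestamps, strictly increasing and aligned to the interval
 * @param values     values corresponding to the timestamps
 * @param interval   spacing between consecutive timestamps
 * @return list of values with the gaps filled, or null if the input is unusable
 */
public static List<Double> insertMissingByTimestamps(List<Long> timestamps,
                                                     List<Double> values,
                                                     Long interval) {
    // Validate the input
    if (timestamps == null || values == null || interval == null || interval <= 0
            || timestamps.isEmpty() || timestamps.size() != values.size()) {
        return null;
    }
    // A single point has no gaps, and the interpolator needs at least two points
    if (timestamps.size() == 1) {
        return new LinkedList<>(values);
    }
    // Build the linear interpolation function
    LinearInterpolator interp = new LinearInterpolator();
    double[] timestampsAry = timestamps.stream().mapToDouble(d -> d).toArray();
    double[] valuesAry = values.stream().mapToDouble(d -> d).toArray();
    PolynomialSplineFunction insertMissingFunc = interp.interpolate(timestampsAry, valuesAry);
    // Number of points a gap-free series would contain
    final int normalSize = countMissingPoints(timestamps, interval);
    long current = timestamps.get(0);
    List<Double> target = new LinkedList<>();
    // Walk the grid and fill the gaps
    for (int i = 0; i < timestamps.size() && target.size() < normalSize; ) {
        if (current == timestamps.get(i)) {
            // The point exists: copy its value
            target.add(values.get(i));
            i++;
        } else {
            // The point is missing: interpolate
            target.add(insertMissingFunc.value(current));
        }
        current += interval;
    }
    return target;
}

// Assumed implementation (the helper's definition is not included in the source):
// number of grid points between the first and last timestamp, inclusive.
private static int countMissingPoints(List<Long> timestamps, Long interval) {
    long span = timestamps.get(timestamps.size() - 1) - timestamps.get(0);
    return (int) (span / interval) + 1;
}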
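The interpolation step has preconditions worth checking before the call: `LinearInterpolator.interpolate` needs at least two points, arrays of equal length, and strictly increasing x values. The grid walk additionally relies on every timestamp being aligned to the interval. Below is a minimal sketch of such a pre-check, assuming one would rather fail early with a readable message; `SeriesChecks` and `validate` are hypothetical names, not part of the original code or of the library.

```java
import java.util.List;
import java.util.Optional;

final class SeriesChecks {
    /** Returns an empty Optional when the series is usable, otherwise a description of the problem. */
    static Optional<String> validate(List<Long> timestamps, List<Double> values, long interval) {
        if (timestamps == null || values == null) {
            return Optional.of("timestamps and values must not be null");
        }
        if (timestamps.size() != values.size()) {
            return Optional.of("timestamps and values differ in length");
        }
        if (timestamps.size() < 2) {
            return Optional.of("at least two points are required");
        }
        if (interval <= 0) {
            return Optional.of("interval must be positive");
        }
        long first = timestamps.get(0);
        for (int i = 0; i < timestamps.size(); i++) {
            long t = timestamps.get(i);
            if (i > 0 && t <= timestamps.get(i - 1)) {
                return Optional.of("timestamps must be strictly increasing");
            }
            // Assumption: the grid starts at the first timestamp
            if ((t - first) % interval != 0) {
                return Optional.of("timestamp " + t + " is not aligned to the interval");
            }
        }
        return Optional.empty();
    }
}
```

A caller could log the message and skip the series, or turn it into an exception, depending on how strict the pipeline should be.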
Test case
// Requires: import java.util.Arrays;
// The points at x = 3, 6 and 7 are missing and will be interpolated
double[] x = { 1, 2, 4, 5, 8, 9 };
double[] y = { 1, 2, 4, 5, 8, 9 };
LinearInterpolator interp = new LinearInterpolator();
PolynomialSplineFunction f = interp.interpolate(x, y);
System.out.println("Piecewise functions:");
Arrays.stream(f.getPolynomials()).forEach(System.out::println);
System.out.println("x = 3, y = " + f.value(3));
System.out.println("x = 6, y = " + f.value(6));
System.out.println("x = 7, y = " + f.value(7));
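Test output (each polynomial is written in terms of the offset from the left knot of its segment, not in terms of x itself):
Piecewise functions:
1 + x
2 + x
4 + x
5 + x
8 + x
x = 3, y = 3.0
x = 6, y = 6.0
x = 7, y = 7.0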
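The printed polynomials can look surprising at first, because a piece such as `2 + x` is evaluated at the distance from the left knot of its segment. The sketch below mimics that evaluation in plain Java to make the convention concrete. `PiecewiseLinear` and its `value` method are illustrative stand-ins written for this post, not the library's implementation.

```java
import java.util.Arrays;

final class PiecewiseLinear {
    /**
     * Evaluates a piecewise-linear function at x.
     * Segment i covers [knots[i], knots[i + 1]] and is stored as
     * intercepts[i] + slopes[i] * (x - knots[i]).
     */
    static double value(double[] knots, double[] intercepts, double[] slopes, double x) {
        int n = knots.length - 1;
        if (x < knots[0] || x > knots[n]) {
            throw new IllegalArgumentException("x is outside [" + knots[0] + ", " + knots[n] + "]");
        }
        int i = Arrays.binarySearch(knots, x);
        if (i < 0) {
            i = -i - 2;     // x lies strictly inside a segment: use the knot to its left
        }
        if (i >= n) {
            i = n - 1;      // x equals the last knot: use the final segment
        }
        return intercepts[i] + slopes[i] * (x - knots[i]);
    }

    public static void main(String[] args) {
        // Knots and pieces matching the printed output "1 + x", "2 + x", "4 + x", "5 + x", "8 + x"
        double[] knots = {1, 2, 4, 5, 8, 9};
        double[] intercepts = {1, 2, 4, 5, 8};
        double[] slopes = {1, 1, 1, 1, 1};
        System.out.println("x = 3, y = " + value(knots, intercepts, slopes, 3));
    }
}
```

For x = 3 the segment starting at knot 2 applies, so the piece `2 + x` is evaluated at 3 - 2 = 1, which gives 3.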
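Note: values outside the interpolation range cannot be computed. The range runs from the first knot to the last knot, so in this example a call such as f.value(0) or f.value(11) throws an OutOfRangeException.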
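If out-of-range queries can occur, one option is to guard the call. The sketch below clamps a query into the interpolation range, which amounts to holding the boundary value flat beyond both ends; whether that is acceptable depends on the data. `RangeGuard` and `clamp` are hypothetical helpers introduced for illustration.

```java
final class RangeGuard {
    /** Clamps x into [knots[0], knots[last]]; knots must be non-empty and sorted ascending. */
    static double clamp(double x, double[] knots) {
        double lo = knots[0];
        double hi = knots[knots.length - 1];
        return Math.max(lo, Math.min(hi, x));
    }

    public static void main(String[] args) {
        double[] knots = {1, 2, 4, 5, 8, 9};
        // A query beyond the last knot is pulled back to the last knot
        System.out.println(clamp(11, knots));
    }
}
```

With the library, the knot array can be obtained from `f.getKnots()`, so a guarded call would look like `f.value(RangeGuard.clamp(x, f.getKnots()))`.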
Example 2:
import java.util.Arrays;
import java.util.stream.IntStream;
import org.apache.commons.math3.analysis.interpolation.LinearInterpolator;
import org.apache.commons.math3.analysis.interpolation.SplineInterpolator;
import org.apache.commons.math3.analysis.polynomials.PolynomialSplineFunction;

// Fills each missing value (NaN) by interpolating between the observed values around it.
// Despite the class name, this is plain interpolation rather than statistical multiple imputation.
public class MultipleImputation {
    public static void main(String[] args) {
        double[] data = {1.2, 3.4, Double.NaN, 5.6, 7.8, Double.NaN, 9.0};
        // Only the observed (non-NaN) points are used as knots;
        // passing NaN values to the interpolator would produce NaN results
        double[] indices = getDataIndices(data);
        double[] values = getDataValues(data);

        // Fill the gaps with linear interpolation
        LinearInterpolator linearInterpolator = new LinearInterpolator();
        PolynomialSplineFunction function = linearInterpolator.interpolate(indices, values);
        System.out.println(Arrays.toString(fill(data, function)));

        // Fill the gaps with cubic spline interpolation, starting again from the original data
        SplineInterpolator splineInterpolator = new SplineInterpolator();
        function = splineInterpolator.interpolate(indices, values);
        System.out.println(Arrays.toString(fill(data, function)));
    }

    // Returns a copy of data with every NaN replaced by the interpolated value.
    // The first and last entries must be observed, since the function cannot extrapolate.
    private static double[] fill(double[] data, PolynomialSplineFunction function) {
        double[] filled = data.clone();
        for (int i = 0; i < filled.length; i++) {
            if (Double.isNaN(filled[i])) {
                filled[i] = function.value(i);
            }
        }
        return filled;
    }

    // Indices of the observed (non-NaN) entries
    private static double[] getDataIndices(double[] data) {
        return IntStream.range(0, data.length)
                .filter(i -> !Double.isNaN(data[i]))
                .asDoubleStream()
                .toArray();
    }

    // Values of the observed (non-NaN) entries
    private static double[] getDataValues(double[] data) {
        return Arrays.stream(data).filter(v -> !Double.isNaN(v)).toArray();
    }
}