InfoWorld's 2020 Best of Open Source Software Awards

Among 2020's 25 winners, you'll find a range of cutting-edge projects and tools for building better web applications, more accurate machine learning models, clearer data visualizations, more flexible workflows, faster and more scalable databases and analytics, and more.

Hasura

Modern applications are written with GraphQL. Everyone has heard the stories: use GraphQL to optimize data fetching instead of sending a thousand REST calls, and GraphQL actually has advantages in security, subscriptions, and scaling real-time queries. Hasura provides useful graphical tools for building, running, and configuring GraphQL queries.
Moreover, Hasura was built with PostgreSQL and PostgreSQL-compatible databases in mind (it now supports MySQL as well). If the JavaScript/REST API era belonged to MongoDB and NoSQL, the GraphQL era belongs to PostgreSQL and distributed SQL. Hasura is one of the best open source GraphQL stacks to emerge from these modern application trends.

Prisma

There are plenty of ORMs for TypeScript applications, but Prisma is the most developer-friendly, with auto-completion for SQL queries. Well, technically, "developers don't think of Prisma as an ORM." It was designed with API development in mind, including gRPC and GraphQL.
Prisma works out of the box with PostgreSQL, MySQL, and SQLite databases. There is a Visual Studio Code extension, along with all the features you would expect from a modern database API and mapping solution. Developers who want to think in terms of objects and queries, with type safety and all the trimmings, should consider Prisma.

Jekyll

Jekyll is one of the best of the new static site generators, packaging our information into a collection of individual web pages that can be pushed to a content delivery network. No database. No elaborate customization. Jekyll simply drops your text into a template, and it's done. You get all the flexibility of good templating mixed with the speed of serving static files at the edge.
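The core idea, text plus template and no database, can be sketched in a few lines of Python. This is a toy illustration of the pattern, not Jekyll itself, which is a Ruby tool built around Liquid templates:

```python
from string import Template

# A minimal stand-in for a site layout (Jekyll itself uses Liquid templates).
layout = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p></body></html>"
)

def render_page(title, body):
    """Drop plain text into the template and emit a static HTML page."""
    return layout.substitute(title=title, body=body)

page = render_page("Hello", "A static page, ready for any CDN.")
```

The output of `render_page` is an ordinary HTML file that can be copied to any static host; that is the entire deployment story of a static site generator.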

Gatsby

Many people like the idea of progressive web app development, and they have embraced React's sophisticated multi-panel approach, but a lot of work is still required to build out everything else. Gatsby is a framework that sits on top of React, leveraging a wealth of plugins to embed things as big as a Shopify store or as small as a JSON data source, along with more than 2,000 other modules to get the job done.
One of the Gatsby project's main goals is to deliver fast web pages, which it achieves through good caching, static page generation, and edge-based CDN data sources. The project claims that static web pages generated by Gatsby are 2.5 times faster than those from other static frameworks.

Drupal

Drupal has been around for a long time. Dries Buytaert released the first open source version in 2001 to help developers build websites out of data-rich nodes composed of multiple fields, a rich framework well suited to tabular data. Now it has re-emerged in a whole new way, because Drupal's code has been cleaned up, refactored, and rewritten. Drupal 9, released in June, is a thoroughly modern PHP web application built on PHP tools such as Composer, Symfony, and Twig.

Vulkan

Vulkan is a new-generation graphics and compute API that provides efficient, cross-platform access to modern GPUs, including the wide variety of devices used in PCs, mobile phones, and embedded platforms. The Vulkan API supports game, mobile, and workstation development. It is the successor to the OpenGL standard. Compared to OpenGL, which is essentially a graphics API, Vulkan is more of a GPU API. There are Vulkan drivers from AMD, Arm, Broadcom, Imagination, Intel, Nvidia, Qualcomm, and VeriSilicon, along with Vulkan SDKs for Windows, Linux, macOS/iOS, and Android. The best-known game engines now support Vulkan as well.

Redis

Redis is a NoSQL in-memory data structure store that combines speed, resilience, scalability, and flexibility, and can be used as a database, cache, and message broker. Redis features built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence. It offers high availability via Redis Sentinel and automatic partitioning with Redis Cluster. Redis typically delivers database latencies below one millisecond.
Redis's core data model is key-value, but many different kinds of values are supported: strings, lists, sets, sorted sets, hashes, streams, HyperLogLogs, and bitmaps. Redis also supports geospatial indexing with radius queries and streams. Redis 6 added several major features, most notably threaded I/O, which delivers a 2x speed improvement. Redis 6 also added access control lists, introducing the notion of users and allowing developers to write more secure code.
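LRU eviction, one of the features listed above, is easy to see in miniature. The toy Python cache below sketches the policy (recently touched keys survive, the least recently used key is evicted first); it is an illustration of the idea, not Redis's actual implementation, which approximates LRU by sampling for efficiency:

```python
from collections import OrderedDict

class LRUCache:
    """Toy key-value store with LRU eviction, akin to Redis's allkeys-lru policy."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:  # over capacity: evict the LRU entry
            self.data.popitem(last=False)

cache = LRUCache(2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")       # touching "a" makes "b" the least recently used key
cache.set("c", 3)    # capacity exceeded, so "b" is evicted
```

In Redis itself the same behavior is a configuration choice (`maxmemory` plus an eviction policy) rather than application code.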

Apache Airflow

To paraphrase Forrest Gump, stuff happens, but sometimes stuff has to happen in a certain order, and then trigger other stuff to happen. In other words, you need a workflow. And if you need a workflow driven by Python, you may want Apache Airflow, which began life at Airbnb.
Airflow lets us structure workflows as directed acyclic graphs (DAGs). It even lets us build dynamic workflows. Unlike other tools that require you to translate your workflow into XML or some other metadata language, Airflow follows the principle of "configuration as code," allowing you to write it in Python scripts.
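The DAG idea at Airflow's heart is easy to see in miniature: a task may run only after all of its upstream tasks finish. Here is a stdlib Python sketch of that ordering using `graphlib` (illustrative only; a real Airflow workflow is declared with `airflow.DAG` and operators, which add scheduling, retries, and backfills on top of this):

```python
from graphlib import TopologicalSorter

# Each task maps to the set of upstream tasks it depends on,
# mirroring the edges of an Airflow DAG.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load", "transform"},
}

# static_order() yields tasks in an order where every task's
# dependencies appear before the task itself.
order = list(TopologicalSorter(dag).static_order())
```

A scheduler like Airflow's does exactly this resolution continuously, launching each task as soon as its predecessors succeed rather than computing one linear order up front.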

Apache Superset

Have you ever priced Tableau and felt your heart race? I mean, you just want a dashboard, but somehow that costs more than some databases. Better yet, have you ever wanted a visualization that Tableau couldn't do, or couldn't do well? Meet Apache Superset, another product that came out of Airbnb.
Superset is a visualization toolkit that combines a SQL IDE, a data explorer, a drag-and-drop dashboard editor, and plugins for building custom visualizations. It can build dashboards from many relational and non-relational databases, and can connect to Apache Drill and Apache Druid. Best of all, Superset supports on-premises deployment, containerization, horizontal scaling, and more.

JanusGraph

Graph databases have always given systems a bit of indigestion. Sure, there are issues with graph queries and with graphs themselves. But is a graph really the right way to store the data? Neo4j's proponents claim it is so efficient that none of us needs modern distributed storage, sharding, or partitioning. But run a really large data set and you'll quickly discover that while graph analytics may be efficient, storing the graph that way is not necessarily so.
JanusGraph lets us run Gremlin graph queries while storing the actual data in a distributed database such as Cassandra, YugabyteDB, ScyllaDB, HBase, or Google's Bigtable. Like Neo4j and other "native graph" databases, JanusGraph supports transactions and indexes, making it suitable for both graph OLTP use and OLAP analytics. If you're doing really big graphs, JanusGraph may be the right approach.

Apache Druid

The world of analytics is changing. Where we used to batch load everything into a giant MPP system and wait for long-running queries, now a large subset of analytics is done in real-time as events happen and get streamed. After all, what happened yesterday or a week ago might as well have been a lifetime ago.
Apache Druid is a distributed column store database that provides both low-latency ingest of data and low-latency queries on top. This is a BI or OLAP style database with out-of-the-box integration with message buses like Apache Kafka and data sources like HDFS. Part data warehouse and part search system, Druid is capable of handling massive amounts of data and designed for the cloud era.

Apache Arrow

With the recent 1.0 release, Apache Arrow has gone from strength to strength, bringing its in-memory columnar format to multiple languages such as C/C++, Go, Java/JVM, Python, JavaScript, Matlab, R, Ruby, and Rust. While Arrow is not something that you’ll explicitly go out and download, you’ll find it at the core of many of the big data and machine learning projects you probably use—Apache Spark, Dask, Amazon Athena, and TensorFlow to mention just a few.
The Arrow project is now turning its attention to communication across machines as well as in-memory. Apache Arrow Flight is its new, general-purpose, client-server framework designed to simplify high performance transport of large data sets over network interfaces. Expect to find Flight powering cluster data transfers in some of your distributed computing applications in the year to come.

Argo

There are many open source workflow engines out there, such as Apache Oozie and Apache Airflow, but unlike those stalwarts, Argo has been designed from the ground up to work with Kubernetes. Originally developed by Intuit, Argo fits right in with your deployments and can directly interact with Kubernetes resources, as well as Docker-led custom steps.
Over the past year, Argo has added new features for templating and resiliency, and now is probably one of the best ways to handle workflows within your clusters. The Argo project includes more features, allowing workflows to be triggered by events and even a CI/CD system, while the well-thought-out modularity means you can use as much or as little of the Argo ecosystem as you desire.

Seldon Core

Creating a good machine learning model is hard, but that’s only the first part of the story. Deploying, monitoring, and maintaining is likely to be even more important to the success of your machine learning strategy in the long run.
A toolkit for deploying models on Kubernetes, Seldon Core offers a boatload of functionality to help you along that journey — a multilingual API that ensures that a model written in Java can be deployed with a PyTorch model in exactly the same way, the ability to construct arbitrary graphs of different models, support for routing requests to A/B or multi-armed bandit experiments, and integration with Prometheus for those all-important metrics.
And it’s all built upon Kubernetes and standard components like Istio or Ambassador. You can expect to find Seldon Core at the heart of many companies’ model deployment strategies for years to come.

Optuna

If you ask a machine learning expert for advice on hyperparameter tuning, they will likely point you in the direction of Hyperopt. But maybe you should give Preferred Networks’ Optuna a whirl instead. The new Optuna 2.0 release comes complete with out-of-the-box integration with TensorFlow, Scikit-learn, Apache MXNet, and PyTorch, with further specific framework support for Keras, PyTorch Lightning, PyTorch Ignite, and Fastai.

Offering faster samplers and hyperband pruning, Optuna can significantly reduce the time required to discover performant parameter optima, and you can get all of this with just a few lines of code. In addition, the framework is incredibly simple to extend for scenarios that step outside of the supplied integrations.
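The idea that Optuna's samplers automate can be sketched with a naive random search in plain Python. This toy stands in for the real thing: Optuna's actual API (`optuna.create_study`, `study.optimize`, `trial.suggest_float`) layers far smarter samplers and pruning on top of this basic loop:

```python
import random

def objective(x):
    # Toy objective with a known minimum at x = 2.
    return (x - 2) ** 2

def random_search(n_trials, low, high, seed=0):
    """Naive sampler: try random parameter values, keep the best one seen."""
    rng = random.Random(seed)
    best_x, best_val = None, float("inf")
    for _ in range(n_trials):
        x = rng.uniform(low, high)
        val = objective(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

best_x, best_val = random_search(n_trials=500, low=-10.0, high=10.0)
```

Where this sketch samples blindly, Optuna's default TPE sampler concentrates trials in promising regions, and its pruners stop unpromising trials early, which is where the large time savings come from.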

K9s

K9s is the SRE’s best friend, and you’ll often find it hidden away in one of their terminals. A Swiss Army knife of monitoring Kubernetes clusters, K9s wraps all of that kubectl functionality in a constantly updating fashion. You can see all of your pods at a glance, or drill down into descriptions and logs with a single keystroke.
Not only that, but K9s allows you to edit resources, shell into pods, and even run benchmarks—all from the same command-line interface. Paired with an extensive plug-in system and skin support, it’s the hacker-friendly interface to Kubernetes that you’ll soon wonder how you managed without.

KubeDirector

KubeDirector is an open source project designed to make it easy to run stateful applications on Kubernetes clusters. Normally this requires writing custom operators, a task requiring high levels of expertise in both the application and Kubernetes.
KubeDirector is implemented as a Kubernetes operator for long-lived, watchful orchestration of stateful applications. At its core it models a domain of applications, allowing the user to specify service endpoints, persistent directories, and anything that must remain constant between instantiations.
There is an example application catalog that includes Apache Kafka, Apache Spark, TensorFlow, Cloudera, and Cassandra. If you have a Kubernetes cluster available, you can get started with any of these applications in just a few minutes.

Bottlerocket

A Linux derivative purpose-built to run containers on large-scale Kubernetes clusters, Bottlerocket is more of a software appliance than an OS. Management is accomplished with a REST API, a command-line client, or a web console. Updates are managed with a Kubernetes operator in a single step, using the “two partition pattern,” and rolled back in case of error. Security is enhanced by presenting a minimal attack surface and enforcing inter-container isolation with cgroups, kernel namespaces, and eBPF. SELinux is enabled by default.
Although designed for running in Amazon EKS, Bottlerocket will also run on premises. The source code is on GitHub and the OS is easily buildable. Core components are written in Rust, and Amazon makes it easy to add your own operators or control containers. Delivering high performance, based on standard Linux, and supported by AWS, Bottlerocket is a compelling choice for both AWS devotees and customers implementing a multicloud strategy.

SPIFFE

SPIFFE (Secure Production Identity Framework For Everyone) is a specification for “cloud native security,” where cloud native is defined as an environment where hosts and processes are created and destroyed frequently. SPIFFE solves the identity problem inherent in large container clusters: how to identify a service, issue the service credentials, and verify those credentials with providers.

Within a single cloud environment, Kerberos and OAuth work nicely for identity management, but things get a bit tricky with true hybrid and multicloud setups. To solve this problem SPIFFE binds identities to workload entities instead of specific hosts. This means that as containers come and go, they can maintain the same identity.
Additionally, SPIFFE assumes a zero-trust network and requires neither keys nor passwords to establish identity. This prevents leakage of secrets because authentication information doesn’t need to be injected into the system at any point. SPIFFE can work with existing identity providers like OAuth.
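The identities SPIFFE binds to workloads are SPIFFE IDs, URIs of the form `spiffe://trust-domain/workload-path`. A minimal parsing sketch in Python (illustrative only; real SPIFFE libraries perform much stricter validation and pair the ID with a cryptographically verifiable SVID document):

```python
from urllib.parse import urlparse

def parse_spiffe_id(uri):
    """Split a SPIFFE ID into (trust_domain, workload_path); raise on bad input."""
    parsed = urlparse(uri)
    if parsed.scheme != "spiffe":
        raise ValueError("SPIFFE IDs must use the spiffe:// scheme")
    if not parsed.netloc:
        raise ValueError("SPIFFE IDs must name a trust domain")
    return parsed.netloc, parsed.path

# The path identifies the workload, not the host it happens to run on,
# which is why the identity survives container churn.
domain, path = parse_spiffe_id("spiffe://example.org/billing/payments")
```

Because the ID names a workload rather than a machine, two replicas of the same service on different nodes present the same identity to policy engines and peers.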

Lem

There have been attempts to modernize Emacs, the 40-year-old standby of “hard core” programmers, but all are fundamentally limited by the underlying engine: Emacs Lisp. Lisp is a great language for an editor, and was the basis for Zmacs, the superb editor of the Lisp Machine, but Emacs Lisp missed out on the millions of DARPA dollars spent on the Common Lisp ANSI specification process that created a language suitable for deploying military-grade applications.

Lem is a greenfield rewrite of Emacs using Common Lisp. Common Lisp gives Lem access to GUI libraries for a modern graphical experience (there’s an alpha version of an Electron GUI), seamless calls to C/C++, and access to a vast array of third-party libraries. Already there is a critical mass of contributors, financial backing, and editing modes for 28 languages. Emacs hackers will no doubt feel right at home with Lem.

Chapel

As data sets get larger and larger, concurrency, parallelism, and distribution become increasingly important when building predictive models, and no one does this better than the supercomputing crowd. A High Performance Computing program is a low-level programming task that might consist of C/C++ or Fortran code, some shell scripts, OpenMP/MPI, and a high level of skill to put it all together.
Chapel makes it easier by providing higher level language constructs for parallel computing that are similar to languages like Python or Matlab. All of the things that make HPC a hard nut to crack in C are handled at a higher level in Chapel—things like creating a distributed array spanning thousands of nodes, a namespace available on any node, and concurrency and parallel primitives.
HPC has always been somewhat of a niche. Partly because the need previously wasn’t there, and partly because the skills were rare. Chapel brings the possibility of running machine learning algorithms at very large scale to the general software programmer. If nothing else, there’s value for everyone in understanding the ideas and concepts that are surfaced elegantly in this language.
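The contrast is easiest to see in code. Chapel can express a data-parallel loop directly (roughly, `forall i in D do result[i] = data[i]**2;`), while a general-purpose language makes you manage the worker pool explicitly. The sketch below shows the Python side of that comparison; it is not Chapel code, and the pool size and workload are arbitrary choices for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

data = list(range(10))

# In Chapel the parallelism is a language construct; here it is
# explicit plumbing: create a pool, map the work, collect results in order.
with ThreadPoolExecutor(max_workers=4) as pool:
    result = list(pool.map(square, data))
```

Chapel additionally distributes such loops across cluster nodes, not just local threads, which is the part no amount of stdlib plumbing reproduces.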

Apromore

Whether you’re trying to improve the efficiency of a longstanding business process, or rolling out a new compliance monitoring service, the first step toward success is gaining an accurate view into your IT systems. Apromore process discovery tools provide this insight by analyzing KPIs from your IT back end, ingesting BPMN activities and flows, and creating a dependency graph of your operations.
Apromore’s browser-based, visual dashboards display an animated process map of resources, activities, timestamps, and domain-specific attributes to reveal statistical and temporal insights. Easy filtering gets you the details you need. Apromore even lets you shift focus between application workflows and the human capital implementing them to identify outliers and bottlenecks.
You will need to pay up for the Enterprise Edition to gain access to the application features and BI connector plug-ins you might need. But the metrics and visual insights provided by Apromore will do wonders to inform your change impact analyses and end-to-end optimization efforts in your enterprise workflows.

Sourcegraph

Enterprise app dev projects have become too complex to be managed by IDE alone. Modern codebases and their dependencies span multiple programming languages, repositories, and geographically distributed teams, creating a challenge just to index the code, let alone search or refactor it.
With plug-ins for major IDEs and web browsers, Sourcegraph integrates into your development workflow to unify the search process. Using regular expressions and language-specific filters, you get a quick and complete picture of the entire codebase enhanced with code intelligence, comprehensive navigation, and hover tooltips that provide references and definitions right inside your browser.
Sourcegraph supercharges the dated IDE with a new class of code navigation, review, and intelligence tools previously reserved for Google and Facebook engineers. If your development team spends any amount of time searching for code, reviewing code, or wondering where code is being reused, you’ll want to explore the power of Sourcegraph.

QuestDB

High-performance time series databases are often closed-source products that not only can be costly to maintain, but also require learning proprietary query languages. Not QuestDB.

A free open source database developed for fast processing of time series data and events, QuestDB is queried using familiar SQL (https://www.infoworld.com/article/3219795/what-is-sql-the-first-language-of-data-analysis.html), along with time series extensions for temporal aggregations. And yet its Java-based query engine delivers blazingly fast response times with minimal latency.
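The flagship time series extension is `SAMPLE BY`, which buckets rows by time interval, so an hourly average in QuestDB reads `SELECT ts, avg(price) FROM trades SAMPLE BY 1h`. The sketch below reproduces the same hourly aggregation in stdlib sqlite3, since it needs no server; the `trades` table and its values are made-up illustration data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (ts TEXT, price REAL)")
conn.executemany("INSERT INTO trades VALUES (?, ?)", [
    ("2020-10-12 09:05:00", 10.0),
    ("2020-10-12 09:40:00", 12.0),
    ("2020-10-12 10:10:00", 20.0),
])

# Hourly average: what QuestDB writes as "... SAMPLE BY 1h" requires
# manual bucketing of the timestamp in plain SQL.
rows = conn.execute(
    "SELECT strftime('%Y-%m-%d %H:00:00', ts) AS bucket, avg(price) "
    "FROM trades GROUP BY bucket ORDER BY bucket"
).fetchall()
```

QuestDB's extension does this bucketing natively over its time-ordered storage, which is a large part of why its temporal aggregations are so fast.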

To deliver its impressive query performance, QuestDB takes advantage of a custom storage engine, modified Google Swiss Tables, SIMD instructions, parallel execution queuing, and pipeline prefetch optimizations. Its onboard web console provides a schema explorer, a code editor for interactive queries, and some basic table and visualization tools.
QuestDB is still a work in progress. Not all queries have been optimized yet, and the SQL dialect is still being fleshed out. But when you get breakthrough time series query performance together with SQL support, who cares about a few wrinkles?
Licensed under Apache 2.0, QuestDB runs on Linux, macOS, and Windows and makes packages available for Docker and Homebrew.

Open Policy Agent

Authorization policy enforcement is typically done manually using hard-coded rules on an ad hoc basis, essentially reinventing the wheel for every application and service. Such a brittle approach inevitably leads to fragmented policy authorization point solutions that become impossible to maintain or audit.
Open Policy Agent provides a general-purpose authorization engine that decouples policy decision-making from application-level enforcement. OPA accepts a series of JSON attributes, evaluates them against the policies and data within its purview, and responds to the application with a Yes or No decision that gets enforced by the caller.
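That decision flow, JSON attributes in and an allow-or-deny answer out, can be sketched in a few lines of Python. This is a toy evaluator to show the shape of the interaction; real OPA policies are written in its Rego language and evaluated by the OPA engine, not by application code like this:

```python
def allow(input_attrs, policy):
    """Toy policy decision: allow only if every policy rule matches the input."""
    return all(input_attrs.get(key) == value for key, value in policy.items())

# Hypothetical rule set: viewers may only issue GET requests.
policy = {"method": "GET", "role": "viewer"}

decision = allow({"method": "GET", "role": "viewer", "path": "/metrics"}, policy)
denied = allow({"method": "DELETE", "role": "viewer"}, policy)
```

The crucial property mirrored here is the decoupling: the caller supplies attributes and enforces the verdict, while the rules live elsewhere and can change without touching application code.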
OPA can be run as a daemon or integrated directly into your service as a library. It is an excellent fit for use cases like microservices, service meshes, API authorization, and Kubernetes admission control, but could just as easily be extended for use in SaaS delivery models, for example.
Combining flexible enforcement with a declarative policy language that simplifies policy creation, OPA returns control over a wide range of technologies back to administrators by treating policy like code that can be managed uniformly and logically across the stack—from bare metal to cloud.